RELATED APPLICATION

This application is a continuation in part of Serial No. 09/061,709 filed April 17, 1998,
incorporated by reference. 
FIELD OF THE INVENTION 

This invention relates to antigens associated with cancer, the nucleic acid molecules
encoding them, as well as the uses of these. 
BACKGROUND AND PRIOR ART 

It is fairly well established that many pathological conditions, such as infections, cancer, 
autoimmune disorders, etc., are characterized by the inappropriate expression of certain 
molecules. These molecules thus serve as "markers" for a particular pathological or abnormal
condition. Apart from their use as diagnostic "targets", i.e., materials to be identified to diagnose
these abnormal conditions, the molecules serve as reagents which can be used to generate
diagnostic and/or therapeutic agents. A by no means limiting example of this is the use of cancer 
markers to produce antibodies specific to a particular marker. Yet another non-limiting example 
is the use of a peptide which complexes with an MHC molecule, to generate cytolytic T cells 
against abnormal cells. 

Preparation of such materials, of course, presupposes a source of the reagents used to 
generate these. Purification from cells is one laborious, far from sure method of doing so. 
Another preferred method is the isolation of nucleic acid molecules which encode a particular 
marker, followed by the use of the isolated encoding molecule to express the desired molecule. 

Two basic strategies have been employed for the detection of such antigens, in e.g., 
human tumors. These will be referred to as the genetic approach and the biochemical approach. 
The genetic approach is exemplified by, e.g., dePlaen et al., Proc. Natl. Acad. Sci. USA 85: 2275
(1988), incorporated by reference. In this approach, several hundred pools of plasmids of a
cDNA library obtained from a tumor are transfected into recipient cells, such as COS cells, or
into antigen-negative variants of tumor cell lines which are tested for the expression of the
specific antigen. The biochemical approach, exemplified by, e.g., O. Mandelboim, et al., Nature
369: 69 (1994) incorporated by reference, is based on acidic elution of peptides which have
bound to MHC-class I molecules of tumor cells, followed by reversed-phase high performance
liquid chromatography (HPLC). Antigenic peptides are identified after they bind to empty
MHC-class I molecules of mutant cell lines, defective in antigen processing, and induce specific
reactions with cytotoxic T-lymphocytes. These reactions include induction of CTL proliferation,
TNF release, and lysis of target cells, measurable in an MTT assay, or a 51Cr release assay.

These two approaches to the molecular definition of antigens have the following 
disadvantages: first, they are enormously cumbersome, time-consuming and expensive; and 
second, they depend on the establishment of cytotoxic T cell lines (CTLs) with predefined 
specificity. 

The problems inherent to the two known approaches for the identification and molecular
definition of antigens are best demonstrated by the fact that both methods have, so far, succeeded
in defining only very few new antigens in human tumors. See, e.g., van der Bruggen et al.,
Science 254: 1643-1647 (1991); Brichard et al., J. Exp. Med. 178: 489-495 (1993); Coulie, et
al., J. Exp. Med. 180: 35-42 (1994); Kawakami, et al., Proc. Natl. Acad. Sci. USA 91: 3515-3519
(1994).

Further, the methodologies described rely on the availability of established, permanent 
cell lines of the cancer type under consideration. It is very difficult to establish cell lines from 
certain cancer types, as is shown by, e.g., Oettgen, et al., Immunol. Allerg. Clin. North. Am.
10: 607-637 (1990). It is also known that some epithelial cell type cancers are poorly susceptible
to CTLs in vitro, precluding routine analysis. These problems have stimulated the art to develop 
additional methodologies for identifying cancer associated antigens. 

One key methodology is described by Sahin, et al., Proc. Natl. Acad. Sci. USA 92:
11810-11813 (1995), incorporated by reference. Also, see U.S. Patent No. 5,698,396, and
Application Serial No. 08/479,328, filed on June 7, 1995 and January 3, 1996, respectively. All 
three of these references are incorporated by reference. To summarize, the method involves the 
expression of cDNA libraries in a prokaryotic host. (The libraries are secured from a tumor 
sample). The expressed libraries are then immunoscreened with absorbed and diluted sera, in 
order to detect those antigens which elicit high titer humoral responses. This methodology is 
known as the SEREX method ("Serological identification of antigens by Recombinant 
Expression Cloning"). The methodology has been employed to confirm expression of previously 
identified tumor associated antigens, as well as to detect new ones. See the above referenced 
patent applications and Sahin, et al., supra, as well as Crew, et al., EMBO J. 14: 2333-2340
(1995). 

This methodology has been applied to a range of tumor types, including those described 
by Sahin et al., supra, and Pfreundschuh, supra, as well as to esophageal cancer (Chen et al.,
Proc. Natl. Acad. Sci. USA 94: 1914-1918 (1997)); lung cancer (Güre et al., Cancer Res. 58:
1034-1041 (1998)); colon cancer (Serial No. 08/948,705 filed October 10, 1997), incorporated
by reference, and so forth. Among the antigens identified via SEREX are the SSX2 molecule
(Sahin et al., Proc. Natl. Acad. Sci. USA 92: 11810-11813 (1995); Tureci et al., Cancer Res.
56: 4766-4772 (1996)); NY-ESO-1 (Chen et al., Proc. Natl. Acad. Sci. USA 94: 1914-1918
(1997)); and SCP1 (Serial No. 08/892,705 filed July 15, 1997), incorporated by reference.

Analysis of SEREX identified antigens has shown overlap between SEREX defined and CTL
defined antigens. MAGE-1, tyrosinase, and NY-ESO-1 have all been shown to be recognized 
by patient antibodies as well as CTLs, showing that humoral and cell mediated responses do act 
in concert. 

It is clear from this summary that identification of relevant antigens via SEREX is a 
desirable aim. The inventors have modified standard SEREX protocols and have screened a cell 
line known to be a good source of the antigens listed supra, using an allogeneic patient sample.
New antigens have been identified in this way and have been studied. Also, a previously known 
molecule has now been identified via SEREX techniques. 
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Example 1 

The melanoma cell line referred to as SK-MEL-37 was used, because it has been shown to
express a number of members of the CT antigen family, including MAGE-1 (Chen et al., Proc.
Natl. Acad. Sci. USA 91: 1004-1008 (1994)); NY-ESO-1 (Chen et al., Proc. Natl. Acad. Sci.
USA 94: 1914-1918 (1997)); and various members of the SSX family (Güre et al., Int. J. Cancer
72: 965-971 (1997)).

Total RNA was extracted from cultured samples of SK-MEL-37 using standard methods, 
and this was then used to construct a cDNA library in a commercially available λZAP expression
vector, following protocols provided by the manufacturer. The cDNA was then transfected into 
E. coli and screened, following Sahin et al., Proc. Natl. Acad. Sci. USA 92: 11810-11813 
(1995), incorporated by reference, and Pfreundschuh, U.S. Patent No. 5,698,396, also 
incorporated by reference. The screening was done with allogeneic patient serum "NW38." This 
serum had been shown, previously, to contain high titer antibodies against MAGE-1 and
NY-ESO-1. See, e.g., Jager et al., J. Exp. Med. 187: 265-270 (1998), incorporated by reference.
In brief, serum was diluted 1:10, preabsorbed with lysates of transfected E. coli, further diluted
to 1:2000, and then incubated overnight at room temperature with nitrocellulose membranes
containing phage plaques, prepared in accordance with Sahin et al., and Pfreundschuh, supra.
The library contained a total of 2.3 x 10^7 primary clones. After washing, the filters were incubated
with alkaline phosphatase conjugated, goat anti-human Fcγ secondary antibodies, and were then
visualized by incubating with 5-bromo-4-chloro-3-indolyl phosphate, and nitroblue tetrazolium.

After screening 1.5 x 10^5 of the clones, a total of sixty-one positives had been identified.
Given this number, screening was stopped, and the positive clones were subjected to further 
analysis. 

Example 2 

The positive clones identified in example 1, supra, were purified, the inserts were excised
in vitro, and inserted into a commercially available plasmid, pBK-CMV, and then evaluated on
the basis of restriction mapping with EcoRI and XbaI. Clones which represented different inserts
on the basis of this step were sequenced, using standard methodologies. 

There was a group of 10 clones, which could not be classified other than as 
"miscellaneous genes", in that they did not seem to belong to any particular family. They 
consisted of 9 distinct genes, of which four were known, and five were new. The fifty one 
remaining clones were classified into four groups. The data are presented in Tables 1 and 2, 
which follow. 

The largest group are genes related to KOC ("KH-domain containing gene, overexpressed
in cancer"), which has been shown to be overexpressed in pancreatic cancer, and maps to
chromosome 7p11.5. See Müeller-Pillasch et al., Oncogene 14: 2729-2733 (1997). Two of the
33 were derived from the KOC gene, and the other 31 were derived from two previously 
unidentified, but related genes. Examples 6 et seq. describe work on this group of clones. 

Eleven clones, i.e., Group 2, were MAGE sequences. Four were derived from MAGE-4a, 
taught by DePlaen et al., Immunogenetics 40: 360-369, Genbank U10687, while the other 7 
hybridized to a MAGE-4a probe, derived from the 5' sequence, suggesting they belong to the 
MAGE family. 

The third group consisted of five clones of the NY-ESO-1 family. Two were identical 
to the gene described by Chen et al., Proc. Natl. Acad. Sci. USA 94: 1914-1918 (1997), and
in Serial No. 08/725,182, filed October 3, 1996, incorporated by reference. The other three were
derived from a second member of the NY-ESO-1 family, i.e., LAGE-1. See U.S. application 
Serial No. 08/791,495, filed January 27, 1997 and incorporated by reference. 

The fourth, and final group, related to a novel gene referred to as CT7. This gene, the 
sequence of which is presented as SEQ ID NO: 1, was studied further. 

Table 1. SEREX-identified genes from allogeneic screening of SK-MEL-37 library

Gene group       # of clones   Comments
KOC              33            derived from 3 related genes
MAGE             11            predominantly MAGE-4a (see text)
NY-ESO-1         5             derived from 2 related genes (NY-ESO-1, LAGE-1)
CT7              2             new cancer/testis antigen
Miscellaneous    10            see Table 2

Table 2. SEREX-identified genes from allogeneic screening of SK-MEL-37 library: Miscellaneous group

Clone designation   Gene
MNW-4, MNW-7        S-adenyl homocysteine hydrolase
MNE-6a              Glutathione synthetase
MNW-24              proliferation-associated protein p38-2G4
MNW-27a             phosphoribosyl pyrophosphate synthetase-associated protein 39
MNW-6b              unknown gene, identical to sequence tags from pancreas, uterus etc.
MNW-14b             unknown gene, identical to sequence tags from lung, brain, fibroblast etc.
MNW-34a             unknown gene, identical to sequence tags from multiple tissues
MNW-17              unknown gene, identical to sequence tags from pancreas and fetus
MNW-29a             unknown gene, no significant sequence homology, universally expressed

Example 3

The two clones for CT7, referred to supra, were 2184 and 1965 base pairs long. Analysis
of the longer one was carried out. It presented an open reading frame of 543 amino acids, which 
extended to the 5' end of the sequence, indicating that it was a partial cDNA clone. 

In order to identify the complete sequence, and to try to identify additional, related genes, 
a human testicular cDNA library was prepared, following standard methods, and screened with 
probes derived from the longer sequence, following standard methods. 

Eleven positives were detected, and sequenced, and it was found that all derived from the 
same gene. When the polyA tail was excluded, the full length transcript, as per SEQ ID NO: 1,
consisted of 4265 nucleotides, broken down into 286 base pairs of untranslated 5' region, a
coding region of 3429 base pairs, and 550 base pairs of untranslated 3' region. The predicted
protein is 1142 amino acids long, and has a calculated molecular mass of about 125 kilodaltons.
See SEQ ID NO: 2. 
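
The segment lengths given for the CT7 transcript can be checked arithmetically. The sketch below is illustrative only: the `Transcript` class is a hypothetical helper, and it assumes that the stated coding region includes a single terminal stop codon, which the text does not say explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Segment lengths of an mRNA, in nucleotides (polyA tail excluded)."""
    five_prime_utr: int
    coding: int
    three_prime_utr: int

    @property
    def length(self) -> int:
        return self.five_prime_utr + self.coding + self.three_prime_utr

    @property
    def protein_length(self) -> int:
        # Assumption: the coding region ends with one stop codon,
        # which does not contribute an amino acid.
        if self.coding % 3 != 0:
            raise ValueError("coding region is not a whole number of codons")
        return self.coding // 3 - 1


# Values as stated in Example 3 for SEQ ID NO: 1.
ct7 = Transcript(five_prime_utr=286, coding=3429, three_prime_utr=550)

if __name__ == "__main__":
    print(ct7.length, ct7.protein_length)
```

Under that assumption, the three segments add up to the stated transcript length and the codon count matches the stated protein length.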

The nucleotide and deduced amino acid sequences were screened against known 
databases, and there was some homology with the MAGE-10 gene, described by DePlaen et al.,
Immunogenetics 40: 360-369 (1994). The homology was limited to about 210 carboxy terminal
amino acids, i.e., amino acids 908-1115 of the subject sequence, and 134-342 of MAGE-10. The
percent homology was 56%, rising to 75% when conservative changes are included. 

There was also extensive homology with a sequence reported by Lucas et al., Cancer Res.
58: 743-752 (1998), and application Serial No. 08/845,528 filed April 25, 1997, also
incorporated by reference. A total of 14 nucleotides differ in the open reading frame, resulting
in a total of 11 amino acids which differ between the sequences.

The 5' region of the nucleotide sequence and corresponding amino acid sequence
demonstrates a strikingly repetitive pattern, with repeats rich in serine, proline, glutamine, and
leucine, with an almost invariable core of PQSPLQI (SEQ ID NO: 3). In the middle of the
molecule, 11 almost exact repeats of 35 amino acids were observed. The repetitive portions
make up about 70% of the entire sequence, beginning shortly after translation initiation, at position
15, and ending shortly before the region homologous to MAGE 4a.

Example 4 

The expression pattern for mRNA of CT7 was then studied, in both normal and malignant 
tissues. RT-PCR was used, employing primers specific for the gene. The estimated melting 
temperature of the primers was 65-70°C, and they were designed to amplify 300-600 base pair 
segments. A total of 35 amplification cycles were carried out, at an annealing temperature of 
60°C. Table 3, which follows, presents the data for human tumor tissues. CT7 was expressed 
in a number of different samples. Of fourteen normal tissues tested, there was strong expression 
in testis, and none in colon, brain, adrenal, lung, breast, pancreas, prostate, thymus or uterus 
tissue. There was low level expression in liver, kidney, placenta and fetal brain, with fetal brain 
showing three transcripts of different size. The level of expression was at least 20-50 times lower 
than in testis. Melanoma cell lines were also screened. Of these 7 of the 12 tested showed strong 
expression, and one showed weak expression. 

Table 3. CT7 mRNA expression in various human tumors by RT-PCR

Tumor type          mRNA, positive/total
Melanoma            7/10
Breast cancer       3/10
Lung cancer         3/9
Head/neck cancer    5/14
Bladder cancer      4/9
Colon cancer        1/10
Leiomyosarcoma      1/4
Synovial sarcoma    2/4
Total               26/70
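
The positive/total figures of Table 3 can be converted to percentages and summed to confirm the stated total. This is merely a sketch; the variable names are chosen for illustration, and the percentages are derived values that do not appear in the original table.

```python
# (tumor type, mRNA-positive samples, samples tested), copied from Table 3.
results = [
    ("Melanoma", 7, 10),
    ("Breast cancer", 3, 10),
    ("Lung cancer", 3, 9),
    ("Head/neck cancer", 5, 14),
    ("Bladder cancer", 4, 9),
    ("Colon cancer", 1, 10),
    ("Leiomyosarcoma", 1, 4),
    ("Synovial sarcoma", 2, 4),
]

positive_total = sum(pos for _, pos, _ in results)
tested_total = sum(n for _, _, n in results)
overall_rate = 100.0 * positive_total / tested_total

if __name__ == "__main__":
    for tumor, pos, n in results:
        print(f"{tumor:<18}{pos:>2}/{n:<3} {100.0 * pos / n:5.1f}%")
    print(f"{'Total':<18}{positive_total}/{tested_total} {overall_rate:5.1f}%")
```

The row sums reproduce the 26/70 total given in the table.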



Example 5 

Southern blotting experiments were then carried out to determine if CT7 belonged to a 
family of genes. In these experiments, genomic DNA was extracted from normal human tissues. 
It was digested with BamHI, EcoRI, and HindIII, separated on a 0.7% agarose gel, blotted onto
a nitrocellulose filter, and hybridized, at high stringency (65°C, aqueous buffer), with a 32P
labelled probe, derived from SEQ ID NO: 1. 

The blotting showed anywhere from two to four bands, suggesting one or two genes in 
the family. 

Example 6 

As noted in example 2, supra, thirty three of the sixty one positive clones were related 
to KOC. Clones were sequenced using standard methodologies. As indicated supra, one clone 
was identical to KOC, initially reported by Müeller-Pillasch, et al., supra. Given that two
additional related sequences were identified, the known KOC gene is referred to as KOC-1 
hereafter (SEQ ID NO: 4). The second clone, referred to as KOC-2 hereafter, was found once. 
The sequence is presented as SEQ ID NO: 5. Its deduced amino acid sequence is 72.5% identical 
to that for KOC-1. 

The third sequence, KOC-3, appeared thirty times (SEQ ID NO: 6). Its deduced amino 
acid sequence is 63% identical to KOC-1. 

Testicular cDNA libraries were analyzed in the same way that the SK-MEL-37 library 
was analyzed, i.e., with allogeneic serum from NW-38. See example 3, supra . 

Following analysis of testicular libraries, a longer form of KOC-2 was isolated. This is 
presented as SEQ ID NO: 7. When SEQ ID NOS: 5 & 7 are compared, the former is 1705 base
pairs in length, without a polyA tail. It contains 1362 base pairs of coding sequence, and 343
base pairs of 3' untranslated sequence. Nucleotides 275-1942 of SEQ ID NO: 7 are identical to
nucleotides 38-1705 of SEQ ID NO: 5. 
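
The relationship between the two KOC-2 sequences amounts to a fixed coordinate offset, which may be sketched as follows. The helper names (`span`, `short_to_long`) are hypothetical and are used only for this illustration; the coordinates are those quoted above.

```python
# Identical stretches, as 1-based inclusive coordinates quoted in Example 6.
SHORT_RANGE = (38, 1705)   # within SEQ ID NO: 5
LONG_RANGE = (275, 1942)   # within SEQ ID NO: 7

# Stated composition of SEQ ID NO: 5 (coding, 3' untranslated).
CODING_BP, UTR3_BP = 1362, 343


def span(coords):
    """Length of a 1-based inclusive coordinate range."""
    start, end = coords
    return end - start + 1


# A position p in SEQ ID NO: 5 corresponds to p + OFFSET in SEQ ID NO: 7.
OFFSET = LONG_RANGE[0] - SHORT_RANGE[0]


def short_to_long(position):
    """Map a SEQ ID NO: 5 position onto SEQ ID NO: 7 within the shared stretch."""
    if not SHORT_RANGE[0] <= position <= SHORT_RANGE[1]:
        raise ValueError("position lies outside the shared stretch")
    return position + OFFSET


if __name__ == "__main__":
    print(span(SHORT_RANGE), span(LONG_RANGE), OFFSET)
```

Both ranges cover the same number of nucleotides, so the stated correspondence is internally consistent, and the coding and 3' untranslated lengths add up to the stated length of SEQ ID NO: 5.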

The sequence of KOC-3, set forth as SEQ ID NO: 6, is 3412 base pairs long, and consists 
of 72 base pairs of 5' untranslated region, 1707 base pairs of open reading frame, and 1543 base
pairs of untranslated, 3' region. An alternate form was also isolated, (SEQ ID NO: 8), and is 129 
base pairs shorter than SEQ ID NO: 6. 

Example 7 

Expression patterns for KOC-1, KOC-2 and KOC-3 were then studied, using RT-PCR 
and the following primer pairs: 


GAAAGTATCT TCAAGGACGC C 

CTGCAAGGGG TTTTGCTGGG CG 
(SEQ ID NOS: 9 & 10). 

TCCTTGCGCG CTGCGGCCTC AG 

CCAACTGGTG GCCATTGAGCT TC 
(SEQ ID NOS: 11 & 12).

GCTCTTTGGG GACAGGAAGG TC 

GACGTTGACA ACGGCGGTTT CT 
(SEQ ID NOS: 13 & 14).

SEQ ID NOS: 9 & 10 were designed to amplify KOC-1 while SEQ ID NOS: 11 & 12 were
designed to amplify KOC-2, and SEQ ID NOS: 13 & 14 were designed to amplify KOC-3. 

To carry out the RT-PCR, relevant primer pairs were added to cDNA samples prepared 
from various mRNAs by reverse transcription. PCR was then carried out at an annealing 
temperature of 60°C, and extension at 72°C, for 35 cycles. The resulting products were then 
analyzed by gel electrophoresis. 
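
Basic properties of the primers listed above, such as length and GC content, can be computed directly from the printed sequences. The sketch below also applies the Wallace rule (2°C per A/T, 4°C per G/C) as a rough melting temperature estimate. That rule is an assumption made for this illustration; the specification does not state how primer melting temperatures were estimated, and the values need not agree with the figures it quotes.

```python
# Primer sequences as printed in Example 7, with spacing removed.
PRIMERS = {
    "SEQ ID NO: 9": "GAAAGTATCTTCAAGGACGCC",
    "SEQ ID NO: 10": "CTGCAAGGGGTTTTGCTGGGCG",
    "SEQ ID NO: 11": "TCCTTGCGCGCTGCGGCCTCAG",
    "SEQ ID NO: 12": "CCAACTGGTGGCCATTGAGCTTC",
    "SEQ ID NO: 13": "GCTCTTTGGGGACAGGAAGGTC",
    "SEQ ID NO: 14": "GACGTTGACAACGGCGGTTTCT",
}


def gc_count(seq):
    """Number of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains non-ACGT characters")
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Rough Tm estimate by the Wallace rule (an assumption, see lead-in)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_pct = 100.0 * gc_count(seq) / len(seq)
        print(f"{name:<14} len={len(seq)} GC={gc_pct:.0f}% Tm~{wallace_tm(seq)} C")
```

More refined estimates (for example, nearest-neighbor methods) would give different numbers; the rule used here is only a first approximation.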

SEQ ID NOS 9 & 10 amplify nucleotides 305-748 of SEQ ID NO: 1. A variety of 
normal and malignant cell types were tested. Strong expression was found in testis, moderate 
expression in normal brain, and low levels of expression were found in normal colon, kidney, 
and liver. 

The Müeller-Pillasch paper, cited supra, identified expression of KOC-1 in pancreatic
tumor cell lines, gastric cancer, and normal placenta, via Northern blotting. This paper also
reported that normal heart, brain, lung, liver, kidney and pancreatic tissue were negative for
KOC-1 expression. The difference in results suggests that the level of expression of KOC-1 is
very low in normal tissues. 

When KOC-2 expression was studied, the only positive normal tissue was testis (brain, 
liver, kidney and colon were negative). 

Modification of the protocol for detecting KOC-2 resulted in positives in normal kidney, 
liver and melanoma. 

When KOC-3 expression was studied, it was found that the gene was universally 
expressed in normal tissues, with highest expression in testis. 

The pattern of expression of KOC-3 in different melanoma cell lines was analyzed, using 
standard Northern blotting. Overexpression in several cell lines was observed, which is
consistent with the more frequent isolation of this clone than any other. 

Example 8 

A study was carried out to determine if KOC-1 is expressed at higher levels in melanoma 
cells, as compared to normal skin cells. This was done using representational difference analysis,
or "RDA." See Lisitsyn, et al. Science 259: 946-951 (1993), and O'Neill, et al. Nucl. Acids Res.
25: 2681-2 (1997), both of which are incorporated by reference. Specifically, tester cDNA was
taken from SK-MEL-37, and driver cDNA was taken from a skin sample representing mRNA
from various cell types in the skin. The cDNAs were digested with either Tsp509I, Hsp92II, or
DpnII. When DpnII was the enzyme used for digestion, adaptor oligonucleotides R-Bgl-24,
J-Bgl-24, and N-Bgl-24 described by O'Neill, et al., supra, and Hubank, et al. Nucl. Acids Res.
22: 5640-5648 (1994) were used. When Tsp509I was the endonuclease, the same adaptors were
used, as were R-Tsp-12, i.e.:

AATTTGCGGT GA
(SEQ ID NO: 15),
J-Tsp-12, i.e.:

AATTTGTTCA TG
(SEQ ID NO: 16),
and N-Tsp-12, i.e.:

AATTTTCCCT CG
(SEQ ID NO: 17).

When Hsp92II was the endonuclease, the adaptors were:
R-Hsp-24, i.e.:

AGCACTCTCC AGCCTCTCAC CATG
(SEQ ID NO: 18);
J-Hsp-24, i.e.:

ACCGACGTCG ACTATCATG CATG
(SEQ ID NO: 19);
N-Hsp-24, i.e.:

AGGCAACTGT GCTATCCGAG CATG
(SEQ ID NO: 20);
R-Hsp-8, i.e.:

GTGAGAGG
(SEQ ID NO: 21);
J-Hsp-8, i.e.:

CATGGATG
(SEQ ID NO: 22);
N-Hsp-8, i.e.:

CTCGGATA
(SEQ ID NO: 23).
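
The adaptor sequences can be checked against the restriction sites they are meant to match. The sketch below rests on two assumptions that go beyond the text: that Tsp509I recognizes AATT and Hsp92II recognizes CATG (general background knowledge, not stated in this specification), and that each sequence belongs to the adaptor name nearest to it in the listing. The J-Hsp pair is left out because J-Hsp-24 is printed here with only 23 bases. All function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# 12-mers used with Tsp509I (assumed recognition sequence: AATT).
TSP_12MERS = {
    "R-Tsp-12": "AATTTGCGGTGA",
    "J-Tsp-12": "AATTTGTTCATG",
    "N-Tsp-12": "AATTTTCCCTCG",
}

# (24-mer, 8-mer) pairs used with Hsp92II (assumed recognition sequence: CATG).
HSP_PAIRS = {
    "R": ("AGCACTCTCCAGCCTCTCACCATG", "GTGAGAGG"),
    "N": ("AGGCAACTGTGCTATCCGAGCATG", "CTCGGATA"),
}


def short_anneals_before_site(long_adaptor, short_adaptor, site="CATG"):
    """True if the short oligo is complementary to the bases of the long
    adaptor lying immediately 5' of its terminal restriction-site sequence."""
    if not long_adaptor.endswith(site):
        return False
    end = len(long_adaptor) - len(site)
    upstream = long_adaptor[end - len(short_adaptor):end]
    return reverse_complement(short_adaptor) == upstream


if __name__ == "__main__":
    for name, seq in TSP_12MERS.items():
        print(name, "starts with AATT:", seq.startswith("AATT"))
    for name, (long_a, short_a) in HSP_PAIRS.items():
        print(name, "8-mer anneals next to CATG:",
              short_anneals_before_site(long_a, short_a))
```

Under these assumptions, each 12-mer begins with the Tsp509I site sequence, and each 8-mer shown is complementary to the segment of its 24-mer partner adjacent to the terminal CATG.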

In order to hybridize tester and driver, either 3X EE buffer (30 mM EPPS, pH 8, 3 mM
EDTA), or a buffer of 2.4 M tetraethylammonium chloride (TEACl), 3 mM EDTA, 10 mM Tris
HCl, pH 8, was used. When DNA was dissolved in 10 µl of TEACl buffer, it was denatured at
80°C for 10 minutes, followed by renaturing at 42°C for 20 hours. Amplicons were gel purified,
and the DP3 or DP2 product was ligated into cloning vectors digested with BamHI (when DpnII
was used), EcoRI (when Tsp509I was used), or SphI (when Hsp92II was used), and then
sequenced. Sequence analysis of the cDNA molecules derived from these experiments identified
KOC-1 as one of the genes isolated, indicating that KOC-1 mRNA is present at a higher level
in SK-MEL-37 cells as compared to normal skin cells.

The foregoing examples describe the isolation of a nucleic acid molecule which encodes 
a cancer associated antigen. "Associated" is used herein because while it is clear that the relevant 
molecule was expressed by several types of cancer, other cancers, not screened herein, may also 
express the antigen. 

The invention relates to those nucleic acid molecules which encode the antigens CT7, 
KOC-2 and KOC-3, as described herein, such as a nucleic acid molecule consisting of the
nucleotide sequence SEQ ID NO: 1, molecules comprising the nucleotide sequence of SEQ ID
NO: 5, 6, 7 or 8 and so forth. Also embraced are those molecules which are not identical to SEQ
ID NOS: 1, 5, 6, 7 or 8, but which encode the same antigen. 

Also a part of the invention are expression vectors which incorporate the nucleic acid 
molecules of the invention, in operable linkage (i.e., "operably linked") to a promoter. 
Construction of such vectors, such as viral (e.g., adenovirus or Vaccinia virus) or attenuated viral 
vectors is well within the skill of the art, as is the transformation or transfection of cells, to 
produce eukaryotic cell lines, or prokaryotic cell strains which encode the molecule of interest. 
Exemplary of the host cells which can be employed in this fashion are COS cells, CHO cells, 
yeast cells, insect cells (e.g., Spodoptera frugiperda), NIH 3T3 cells, and so forth. Prokaryotic
cells, such as E. coli and other bacteria may also be used. Any of these cells can also be
transformed or transfected with further nucleic acid molecules, such as those encoding cytokines, 
e.g., interleukins such as IL-2, 4, 6, or 12 or HLA or MHC molecules. 

Also a part of the invention are the antigens described herein, both in original form and 
in any different post-translationally modified forms. The molecules are large enough to be
antigenic without any posttranslational modification, and hence are useful as immunogens, when 
combined with an adjuvant (or without it), in both precursor and post-translationally modified 
forms. Antibodies produced using these antigens, both poly and monoclonal, are also a part of 
the invention as well as hybridomas which make monoclonal antibodies to the antigens. The 
whole protein can be used therapeutically, or in portions, as discussed infra . Also a part of the 
invention are antibodies against this antigen, be these polyclonal, monoclonal, reactive 
fragments, such as Fab, F(ab')2 and other fragments, as well as chimeras, humanized antibodies,
recombinantly produced antibodies, and so forth. 

As is clear from the disclosure, one may use the proteins and nucleic acid molecules of
the invention diagnostically. The SEREX methodology discussed herein is premised on an 
immune response to a pathology associated antigen. Hence, one may assay for the relevant 
pathology via, e.g., testing a body fluid sample of a subject, such as serum, for reactivity with 
the antigen per se. Reactivity would be deemed indicative of possible presence of the pathology. 
So, too, could one assay for the expression of any of the antigens via any of the standard nucleic 
acid hybridization assays which are well known to the art, and need not be elaborated upon 
herein. One could assay for antibodies against the subject molecules, using standard 
immunoassays as well. 

Analysis of SEQ ID NO: 1, 5, 6, 7 and 8 will show that there are 5' and 3' non-coding
regions presented therein. The invention relates to those isolated nucleic acid molecules which
contain at least the coding segment, i.e., nucleotides 54-593, of SEQ ID NO: 1, nucleotides
1-1019 of SEQ ID NO: 3, nucleotides 73-1780 of SEQ ID NO: 8, and so forth, and which may
contain any or all of the non-coding 5' and 3' portions.

Also a part of the invention are portions of the relevant nucleic acid molecules which can 
be used, for example, as oligonucleotide primers and/or probes, such as one or more of SEQ ID 
NOS: 7, 8, 9, 10, 11, 12, 13 or 14, as well as amplification products, such as nucleic acid molecules
comprising at least nucleotides 305-748 of SEQ ID NO: 1. 

As was discussed supra, study of other members of the "CT" family reveals that these are 
also processed to peptides which provoke lysis by cytolytic T cells. There has been a great deal 
of work on motifs for various MHC or HLA molecules, which is applicable here. Hence, a 
further aspect of the invention is a therapeutic method, wherein one or more peptides derived 
from the antigens of the invention which bind to an HLA molecule on the surface of a patient's
tumor cells are administered to the patient, in an amount sufficient for the peptides to bind to the 
MHC/HLA molecules, and provoke lysis by T cells. Any combination of peptides may be used. 
These peptides, which may be used alone or in combination, as well as the entire protein or 
immunoreactive portions thereof, may be administered to a subject in need thereof, using any of 
the standard types of administration, such as intravenous, intradermal, subcutaneous, oral, rectal, 
and transdermal administration. Standard pharmaceutical carriers, adjuvants, such as saponins, 
GM-CSF, and interleukins and so forth may also be used. Further, these peptides and proteins 
may be formulated into vaccines with the listed material, as may dendritic cells, or other cells 
which present relevant MHC/peptide complexes. 

Similarly, the invention contemplates therapies wherein nucleic acid molecules which 
encode the proteins of the invention, or one or more peptides which are derived from these
proteins are incorporated into a vector, such as a Vaccinia or adenovirus based vector, to render 
it transferable into eukaryotic cells, such as human cells. Similarly, nucleic acid molecules 
which encode one or more of the peptides may be incorporated into these vectors, which are then 
the major constituent of nucleic acid based therapies.

Any of these assays can also be used in progression/regression studies. One can monitor 
the course of abnormality involving expression of these antigens simply by monitoring levels of 
the protein, its expression, antibodies against it and so forth using any or all of the methods set 
forth supra . 

It should be clear that these methodologies may also be used to track the efficacy of a 
therapeutic regime. Essentially, one can take a baseline value for a protein of interest using any 
of the assays discussed supra, administer a given therapeutic agent, and then monitor levels of
the protein thereafter, observing changes in antigen levels as indicia of the efficacy of the regime. 

As was indicated supra, the invention involves, inter alia, the recognition of an 
"integrated" immune response to the molecules of the invention. One ramification of this is the 
ability to monitor the course of cancer therapy. In this method, which is a part of the invention, 
a subject in need of the therapy receives a vaccination of a type described herein. Such a 
vaccination results, e.g., in a T cell response against cells presenting HLA/peptide complexes on 
their cells. The response also includes an antibody response, possibly a result of the release of 
antibody provoking proteins via the lysis of cells by the T cells. Hence, one can monitor the 
effect of a vaccine, by monitoring an antibody response. As is indicated, supra, an increase in 
antibody titer may be taken as an indicia of progress with a vaccine, and vice versa. Hence, a 
further aspect of the invention is a method for monitoring efficacy of a vaccine, following 
administration thereof, by determining levels of antibodies in the subject which are specific for 
the vaccine itself, or a large molecule of which the vaccine is a part. 

The identification of the subject proteins as being implicated in pathological conditions 
such as cancer also suggests a number of therapeutic approaches in addition to those discussed 
supra . The experiments set forth supra establish that antibodies are produced in response to 
expression of the protein. Hence, a further embodiment of the invention is the treatment of 
conditions which are characterized by aberrant or abnormal levels of one or more of the proteins, 
via administration of antibodies, such as humanized antibodies, antibody fragments, and so forth. 
These may be tagged or labelled with appropriate cytostatic or cytotoxic reagents.

T cells may also be administered. It is to be noted that the T cells may be elicited in vitro
using immune responsive cells such as dendritic cells, lymphocytes, or any other immune 
responsive cells, and then reperfused into the subject being treated. 

Note that the generation of T cells and/or antibodies can also be accomplished by 
administering cells, preferably treated to be rendered non-proliferative, which present relevant 
T cell or B cell epitopes for response, such as the epitopes discussed supra . 

The therapeutic approaches may also include antisense therapies, wherein an antisense 
molecule, preferably from 10 to 100 nucleotides in length, is administered to the subject either 
"neat" or in a carrier, such as a liposome, to facilitate incorporation into a cell, followed by 
inhibition of expression of the protein. Such antisense sequences may also be incorporated into 
appropriate vaccines, such as in viral vectors (e.g., Vaccinia), bacterial constructs, such as 
variants of the known BCG vaccine, and so forth. 
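
By way of illustration only, the following Python sketch derives an antisense sequence as the reverse complement of a target stretch and enforces the preferred length of 10 to 100 nucleotides stated above. The function name `antisense` and the choice of target (nucleotides 287-300 of SEQ ID NO: 1 as listed, i.e., the start of the coding region) are assumptions made for the example; they are not a sequence or design method taught by this disclosure.

```python
# Illustrative sketch only: reverse complement of a DNA target, with the
# preferred 10-100 nucleotide length range checked. Names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense(target: str, min_len: int = 10, max_len: int = 100) -> str:
    """Return the reverse complement of `target`, written 5' to 3'."""
    seq = target.upper().replace(" ", "")
    if set(seq) - set("ACGT"):
        raise ValueError("target must contain only A, C, G and T")
    if not min_len <= len(seq) <= max_len:
        raise ValueError(f"target length {len(seq)} is outside {min_len}-{max_len}")
    # Complement every base, then reverse the string.
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Nucleotides 287-300 of SEQ ID NO: 1, chosen purely as an example target.
    print(antisense("ATGGGGGACAAGGA"))  # TCCTTGTCCCCCAT
```

Whether any particular antisense molecule actually inhibits expression must be determined experimentally; the sketch addresses only the sequence arithmetic.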

Also a part of the invention are peptides, such as those set forth in Figure 1, and those 
which have as a core sequence 

PQSPLQI (SEQ ID NO: 3) 
These peptides may be used therapeutically, via administration to a patient who expresses CT7 
in connection with a pathology, as well as diagnostically, i.e., to determine if relevant antibodies 
are present and so forth. 
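
As a non-limiting illustration of peptides built around this core, the Python sketch below enumerates the flank combinations that yield a nonamer, decamer or undecamer when one to three N-terminal residues and up to four C-terminal residues are added to the seven-residue core, which is the arrangement recited in claim 27. The function names and the placeholder residue `X` are hypothetical; the sketch performs length arithmetic only and says nothing about which flanking residues occur in the CT7 protein or about HLA binding.

```python
# Illustrative sketch only: length arithmetic for peptides containing the
# core sequence of SEQ ID NO: 3. Function names are hypothetical.
CORE = "PQSPLQI"  # SEQ ID NO: 3


def flank_options(core_len: int = len(CORE)):
    """Return (n_flank, c_flank, total_length) tuples giving 9-, 10- or 11-mers."""
    options = []
    for n in range(1, 4):       # one to three additional N-terminal residues
        for c in range(0, 5):   # up to four additional C-terminal residues
            total = core_len + n + c
            if total in (9, 10, 11):
                options.append((n, c, total))
    return options


def fits(peptide: str) -> bool:
    """True if `peptide` contains CORE with an allowed flank combination."""
    allowed = {(n, c) for n, c, _ in flank_options()}
    start = peptide.find(CORE)
    while start != -1:
        n_flank = start
        c_flank = len(peptide) - start - len(CORE)
        if (n_flank, c_flank) in allowed:
            return True
        start = peptide.find(CORE, start + 1)
    return False


if __name__ == "__main__":
    for n, c, total in flank_options():
        print(f"{n} N-terminal + core + {c} C-terminal = {total}-mer")
    # 'X' stands for an unspecified flanking residue (placeholder only).
    print(fits("XPQSPLQIX"))   # True: 1 + 7 + 1 = 9
    print(fits("PQSPLQIXX"))   # False: no N-terminal residue
```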

Other features and applications of the invention will be clear to the skilled artisan, and 
need not be set forth herein. The terms and expressions which have been employed are used as 
terms of description and not of limitation, and there is no intention in the use of such terms and 
expressions of excluding any equivalents of the features shown and described or portions thereof, 
it being recognized that various modifications are possible within the scope of the invention. 
We claim: 

1. Isolated nucleic acid molecule which encodes a cancer associated antigen, whose 
amino acid sequence is identical to the amino acid sequence encoded by nucleotides 287 to 3714 of 
SEQ ID NO: 1. 

2. The isolated nucleic acid molecule of claim 1, consisting of nucleotides 287-3714 
of SEQ ID NO: 1. 

3. The isolated nucleic acid molecule of claim 1, consisting of anywhere from 
nucleotide 1 through nucleotide 4265 of SEQ ID NO: 1, with the proviso that said isolated 
nucleic acid molecule contains at least nucleotides 287-3714 of SEQ ID NO: 1. 

4. Expression vector comprising the isolated nucleic acid molecule of claim 1, 
operably linked to a promoter. 

5. Expression vector comprising the isolated nucleic acid molecule of claim 3, 
operably linked to a promoter. 

6. Eukaryotic cell line or prokaryotic cell strain, transformed or transfected with the 
expression vector of claim 4. 

7. Eukaryotic cell line or prokaryotic cell strain, transformed or transfected with the 
expression vector of claim 5. 

8. Isolated cancer associated antigen comprising all or part of the amino acid 
sequence encoded by nucleotides 287-3714 of SEQ ID NO: 1. 

9. Eukaryotic cell line or prokaryotic cell strain, transformed or transfected with the 
isolated nucleic acid molecule of claim 1. 

10. The eukaryotic cell line of claim 9, wherein said cell line is also transfected with 
a nucleic acid molecule coding for a cytokine. 
11. The eukaryotic cell line of claim 10, wherein said cell line is further transfected 
by a nucleic acid molecule coding for an HLA molecule. 

12. The eukaryotic cell line of claim 10, wherein said cytokine is an interleukin. 

13. The biologically pure culture of claim 12, wherein said interleukin is IL-2, IL-4 
or IL-12. 

14. The eukaryotic cell line of claim 9, wherein said cell line has been rendered non- 
proliferative. 

15. The eukaryotic cell line of claim 9, wherein said cell line is a fibroblast cell line. 

16. Expression vector comprising a mutated or attenuated virus and the isolated 
nucleic acid molecule of claim 1. 

17. The expression vector of claim 16, wherein said virus is adenovirus or vaccinia 
virus. 

18. The expression vector of claim 17, wherein said virus is vaccinia virus. 

19. The expression vector of claim 17, wherein said virus is adenovirus. 

20. Expression system useful in transfecting a cell, comprising (i) a first vector 
containing a nucleic acid molecule which codes for the isolated cancer associated antigen of 
claim 8 and (ii) a second vector selected from the group consisting of (a) a vector containing a 
nucleic acid molecule which codes for an MHC or HLA molecule which presents an antigen 
derived from said cancer associated antigen and (b) a vector containing a nucleic acid molecule 
which codes for an interleukin. 

21. Isolated cancer associated antigen comprising the amino acid sequence encoded 
by nucleotides 287-3714 of SEQ ID NO: 1. 
22. Immunogenic composition comprising the isolated antigen of claim 21, and a 
pharmaceutically acceptable adjuvant. 

23. The immunogenic composition of claim 22, wherein said adjuvant is a cytokine, 
a saponin, or GM-CSF. 

24. Immunogenic composition comprising at least one peptide consisting of an amino 
acid sequence of from 8 to 12 amino acids concatenated to each other in the isolated cancer 
associated antigen of claim 21, and a pharmaceutically acceptable adjuvant. 

25. The immunogenic composition of claim 24, wherein said adjuvant is a saponin, 
a cytokine, or GM-CSF. 

26. The immunogenic composition of claim 24, wherein said composition comprises 
a plurality of peptides which complex with a specific MHC molecule. 

27. Isolated peptide derived from the amino acid sequence encoded by SEQ ID NO: 
1, wherein said isolated peptide binds to an HLA molecule, is a nonamer, decamer or undecamer, 
and comprises the amino acid sequence of SEQ ID NO: 3, from one to three additional 
N-terminal amino acids, and up to four additional C-terminal amino acids. 

28. Immunogenic composition which comprises at least one expression vector which 
encodes for a peptide derived from the amino acid sequence encoded by SEQ ID NO: 1, and an 
adjuvant or carrier. 

29. The immunogenic composition of claim 28, wherein said at least one expression 
vector codes for a plurality of peptides. 

30. Vaccine useful in treating a subject afflicted with a cancerous condition 
comprising the isolated cell line of claim 11 and a pharmacologically acceptable adjuvant. 

WO 99/54738    PCT/US99/05766 

31. The vaccine of claim 30, wherein said cell line has been rendered non- 
proliferative. 

32. The vaccine of claim 31, wherein said cell line is a human cell line. 

33. A composition of matter useful in treating a cancerous condition comprising a 
non-proliferative cell line having expressed on its surface a peptide derived from the amino acid 
sequence encoded by SEQ ID NO: 1. 

34. The composition of matter of claim 33, wherein said cell line is a human cell line. 

35. A composition of matter useful in treating a cancerous condition, comprising (i) 
a peptide derived from the amino acid sequence encoded by SEQ ID NO: 1, (ii) an MHC or HLA 
molecule, and (iii) a pharmaceutically acceptable carrier. 

36. Isolated antibody which is specific for the antigen of claim 21. 

37. The isolated antibody of claim 36, wherein said antibody is a monoclonal 
antibody. 

38. Method for screening for cancer in a sample, comprising contacting said sample 
with a nucleic acid molecule which hybridizes to all or part of SEQ ID NO: 1, and determining 
hybridization as an indication of cancer cells in said sample. 

39. A method for screening for cancer in a sample, comprising contacting said sample 
with the isolated antibody of claim 36, and determining binding of said antibody to a target as 
an indicator of cancer. 

40. Method for diagnosing a cancerous condition in a subject, comprising contacting 
an immune reactive cell containing sample of said subject to a cell line transfected with the 
isolated nucleic acid molecule of claim 1, and determining interaction of said transfected cell line 
with said immunoreactive cell, said interaction being indicative of said cancer condition. 
41. A method for determining regression, progression or onset of a cancerous 
condition comprising monitoring a sample from a patient with said cancerous condition for a 
parameter selected from the group consisting of (i) CT7 protein, (ii) a peptide derived from CT7 
protein (iii) cytolytic T cells specific for said peptide and an MHC molecule with which it non- 
covalently complexes, and (iv) antibodies specific for said CT7 protein, wherein amount of said 
parameter is indicative of progression or regression or onset of said cancerous condition. 

42. Method of claim 41, wherein said sample is a body fluid or exudate. 

43. Method of claim 41, wherein said sample is a tissue. 

44. Method of claim 41, comprising contacting said sample with an antibody which 
specifically binds with said protein or peptide. 

45. Method of claim 44, wherein said antibody is labelled with a radioactive label or 
an enzyme. 

46. Method of claim 44, wherein said antibody is a monoclonal antibody. 

47. Method of claim 41, comprising amplifying RNA which codes for said protein. 

48. Method of claim 47, wherein said amplifying comprises carrying out polymerase 
chain reaction. 

49. Method of claim 41, comprising contacting said sample with a nucleic acid 
molecule which specifically hybridizes to a nucleic acid molecule which codes for or expresses 
said protein. 

50. Method of claim 41, comprising assaying said sample for shed protein. 

51. Method of claim 41, comprising assaying said sample for antibodies specific for 
said CT7 protein, by contacting said sample with CT7 protein. 
52. Method for diagnosing a cancerous condition comprising assaying a sample taken 
from a subject for an immunoreactive cell specific for a peptide derived from CT7, complexed 
to an MHC molecule, presence of said immunoreactive cell being indicative of said cancerous 
condition. 

53. An isolated nucleic acid molecule which encodes a protein and which has a 
complementary sequence which hybridizes, under stringent conditions, to at least one of the 
nucleotide sequences set forth at SEQ ID NO: 5, 6, 7 or 8. 

54. The isolated nucleic acid molecule of claim 53, wherein said protein is the protein 
encoded by the nucleotide sequence of SEQ ID NO: 5, 6, 7 or 8. 

55. The isolated nucleic acid molecule of claim 53, selected from the group consisting 
of nucleic acid molecules comprising the nucleotide sequence of SEQ ID NO: 5, 6, 7 or 8. 

56. Expression vector comprising the isolated nucleic acid molecule of claim 54, 
operably linked to a promoter. 

57. Expression vector comprising the isolated nucleic acid molecule of claim 55, 
operably linked to a promoter. 

58. Recombinant cell comprising the expression vector of claim 56. 

59. Recombinant cell comprising the expression vector of claim 57. 

60. Recombinant cell comprising the isolated nucleic acid molecule of claim 54. 

61. Recombinant cell comprising the isolated nucleic acid molecule of claim 55. 

62. Recombinant cell of claim 58, further comprising an expression vector which 
contains a nucleic acid molecule encoding a cytokine, operably linked to a promoter. 

63. Recombinant cell of claim 59, further comprising an expression vector which 
contains a nucleic acid molecule encoding a cytokine, operably linked to a promoter. 
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64. Recombinant cell of claim 60, further comprising a nucleic acid molecule which 
encodes a cytokine. 

65. Recombinant cell of claim 61, further comprising a nucleic acid molecule which 
encodes a cytokine. 

66. The recombinant cell of claim 62, 63, 64, or 65, wherein said cytokine is an 
interleukin. 

67. The recombinant cell of claim 66, wherein said interleukin is IL-2, IL-4, or IL-12. 

68. The recombinant cell of claim 58, 59, 60, or 61, wherein said recombinant cell 
is a eukaryotic cell. 

69. The recombinant cell of claim 68, which has been rendered non-proliferative. 

70. The recombinant cell of claim 68, wherein said cell is a fibroblast. 

71. Expression vector comprising a mutated or attenuated virus and the isolated 
nucleic acid molecule of claim 53, 54 or 55. 

72. The expression vector of claim 71, wherein said virus is adenovirus, 
adeno-associated virus, or vaccinia virus. 

73. Expression system useful in making a recombinant cell, comprising: 

(i) a first vector which encodes the protein encoded by the isolated nucleic 
acid molecule of claim 53, 54 or 55, and 

(ii) a second vector which either (a) encodes an MHC or HLA molecule or (b) 
encodes an interleukin. 

74. An isolated cancer associated antigen comprising the amino acid sequence 
encoded by SEQ ID NO: 5, 6, 7 or 8. 
75. Composition comprising the isolated cancer associated antigen of claim 74, and 
a pharmaceutically acceptable adjuvant. 

76. The composition of claim 75, wherein said adjuvant is a cytokine, a saponin, or 
GM-CSF. 

77. Composition comprising at least one peptide consisting of an amino acid sequence 
of from 8 to 25 amino acids concatenated to each other in the isolated cancer associated antigen 
of claim 74, and a pharmaceutically acceptable adjuvant. 

78. The composition of claim 77, wherein said adjuvant is a saponin, a cytokine, or 
GM-CSF. 

79. The composition of claim 77, comprising a plurality of MHC binding peptides. 

80. Composition comprising an expression vector which encodes at least one peptide 
consisting of an amino acid sequence of from 8 to 25 amino acids concatenated to each other in 
the isolated cancer associated antigen of claim 74, and pharmaceutically acceptable adjuvant. 

81. The composition of claim 80, wherein said expression vector encodes a plurality 
of peptides. 

82. Composition useful in treating a subject afflicted with a cancer, comprising the 
recombinant cell of claim 69 and a pharmacologically acceptable adjuvant. 

83. The composition of claim 82, wherein said recombinant cell expresses an HLA 
or MHC molecule. 

84. The composition of claim 82, wherein said recombinant cell is a human cell. 

85. The composition of claim 77, further comprising at least one MHC or HLA 
molecule. 
86. Isolated antibody which specifically binds to the isolated cancer associated 
antigen of claim 74. 

87. The isolated antibody of claim 86, wherein said antibody is a monoclonal 
antibody. 

88. A method for screening for possible presence of a pathological condition, 
comprising assaying a sample from a patient believed to have a pathological condition for 
antibodies specific to at least one of the cancer associated antigens encoded by SEQ ID NOS: 4, 
5, 6, 7 or 8, presence of said antibodies being indicative of possible presence of said pathological 
condition. 

89. The method of claim 88, wherein said pathological condition is cancer. 

90. The method of claim 89, wherein said cancer is melanoma. 

91. The method of claim 90, further comprising contacting said sample to purified 
cancer associated antigen encoded by SEQ ID NO: 4, 5, 6, 7 or 8. 

92. A method for screening for possible presence of a pathological condition in a 
subject, comprising assaying a sample taken from said subject for expression of a nucleic acid 
molecule, the nucleotide sequence of which comprises SEQ ID NO: 5, 6, 7 or 8, expression of 
said nucleic acid molecule being indicative of possible presence of said pathological condition. 

93. The method of claim 92, wherein said pathological condition is cancer. 

94. The method of claim 92, comprising determining expression via polymerase chain 
reaction. 

95. The method of claim 92, comprising determining expression by contacting said 
sample with at least one of SEQ ID NO: 11, 12, 13 or 14. 
96. A method for determining regression, progression or onset of a cancerous 
condition comprising monitoring a sample from a patient with said cancerous condition for a 
parameter selected from the group consisting of (i) a cancer associated antigen encoded by SEQ 
ID NO: 3, 4, 5 or 6, (ii) a peptide derived from said cancer associated antigen, (iii) cytolytic T 
cells specific for said peptide and an MHC molecule with which it non-covalently complexes, 
and (iv) antibodies specific for said cancer associated antigen, wherein amount of said parameter 
is indicative of progression or regression or onset of said cancerous condition. 

97. The method of claim 96, wherein said sample is a body fluid or exudate. 

98. The method of claim 96, wherein said sample is a tissue. 

99. The method of claim 96, comprising contacting said sample with an antibody 
which specifically binds with said protein or peptide. 

100. The method of claim 99, wherein said antibody is labelled with a radioactive label 
or an enzyme. 

101. The method of claim 99, wherein said antibody is a monoclonal antibody. 

102. The method of claim 96, comprising amplifying RNA which codes for said 
protein. 

103. The method of claim 102, wherein said amplifying comprises carrying out 
polymerase chain reaction. 

104. The method of claim 96, comprising contacting said sample with a nucleic acid 
molecule which specifically hybridizes to a nucleic acid molecule which codes for or expresses 
said protein. 

105. The method of claim 96, comprising assaying said sample for shed cancer 
associated antigen. 
106. The method of claim 96, comprising assaying said sample for antibodies specific 
for said cancer associated antigen, by contacting said sample with said cancer associated antigen. 

107. Method for screening for a cancerous condition comprising assaying a sample 
taken from a subject for an immunoreactive cell specific for a peptide derived from a cancer 
associated antigen encoded by SEQ ID NO: 4, 5, 6, 7 or 8 complexed to an MHC molecule, 
presence of said immunoreactive cell being indicative of said cancerous condition. 

108. An isolated nucleic acid molecule consisting of a nucleotide sequence defined by 
SEQ ID NO: 9, 10, 11, 12, 13 or 14. 

109. Kit useful in determining expression of a cancer associated antigen, comprising 
a separate portion of each of (i) the nucleotide sequences defined by SEQ ID NOS: 9 and 10, (ii) 
the nucleotide sequences defined by SEQ ID NOS: 11 and 12, and (iii) the nucleotide sequences 
defined by SEQ ID NOS: 13 and 14. 
<210> 1 

<211> 4265 

<212> DNA 

<213> Homo sapiens 

<220> 

<400> 1 

GTCTGAAGGA CCTGAGGCAT TTTGTGACGA GGATCGTCTC AGGTCAGCGG AGGGAGGAGA 60 
CTTATAGACC TATCCAGTCT TCAAGGTGCT CCAGAAAGCA GGAGTTGAAG ACCTGGGTGT 120 
GAGGGACACA TACATCCTAA AAGCACCACA GCAGAGGAGG CCCAGGCAGT GCCAGGAGTC 180 
AAGGTTCCCA GAAGACAAAC CCCCTAGGAA GACAGGCGAC CTGTGAGGCC CTAGAGCACC 240 
ACCTTAAGAG AAGAAGAGCT GTAAGCCGGC CTTTGTCAGA GCCATCATGG GGGACAAGGA 300 
TATGCCTACT GCTGGGATGC CGAGTCTTCT CCAGAGTTCC TCTGAGAGTC CTCAGAGTTG 360 
TCCTGAGGGG GAGGACTCCC AGTCTCCTCT CCAGATTCCC CAGAGTTCTC CTGAGAGCGA 420 
CGACACCCTG TATCCTCTCC AGAGTCCTCA GAGTCGTTCT GAGGGGGAGG ACTCCTCGGA 480 
TCCTCTCCAG AGACCTCCTG AGGGGAAGGA CTCCCAGTCT CCTCTCCAGA TTCCCCAGAG 540 
TTCTCCTGAG GGCGACGACA CCCAGTCTCC TCTCCAGAAT TCTCAGAGTT CTCCTGAGGG 600 
GAAGGACTCC CTGTCTCCTC TAGAGATTTC TCAGAGCCCT CCTGAGGGTG AGGATGTCCA 660 
GTCTCCTCTG CAGAATCCTG CGAGTTCCTT CTTCTCCTCT GCTTTATTGA GTATTTTCCA 720 
GAGTTCCCCT GAGAGTATTC AAAGTCCTTT TGAGGGTTTT CCCCAGTCTG TTCTCCAGAT 780 
TCCTGTGAGC GCCGCCTCCT CCTCCACTTT AGTGAGTATT TTCCAGAGTT CCCCTGAGAG 840 
TACTCAAAGT CCTTTTGAGG GTTTTCCCCA GTCTCCACTC CAGATTCCTG TGAGCCGCTC 900 
CTTCTCCTCC ACTTTATTGA GTATTTTCCA GAGTTCCCCT GAGAGAAGTC AGAGAACTTC 960 
TGAGGGTTTT GCACAGTCTC CTCTCCAGAT TCCTGTGAGC TCCTCCTCGT CCTCCACTTT 1020 
ACTGAGTCTT TTCCAGAGTT CCCCTGAGAG AACTCAGAGT ACTTTTGAGG GTTTTCCCCA 1080 
GTCTCCACTC CAGATTCCTG TGAGCCGCTC CTTCTCCTCC ACTTTATTGA GTATTTTCCA 1140 
GAGTTCCCCT GAGAGAACTC AGAGTACTTT TGAGGGTTTT GCCCAGTCTC CTCTCCAGAT 1200 
TCCTGTGAGC CCCTCCTTCT CCTCCACTTT AGTGAGTATT TTCCAGAGTT CCCCTGAGAG 1260 
AACTCAGAGT ACTTTTGAGG GTTTTCCCCA GTCTCCTCTC CAGATTCCTG TGAGCTCCTC 1320 
CTTCTCCTCC ACTTTATTGA GTCTTTTCCA GAGTTCCCCT GAGAGAACTC AGAGTACTTT 1380 
TGAGGGTTTT CCCCAGTCTC CTCTCCAGAT TCCTGGAAGC CCCTCCTTCT CCTCCACTTT 1440 
ACTGAGTCTT TTCCAGAGTT CCCCTGAGAG AACTCACAGT ACTTTTGAGG GTTTTCCCCA 1500 
GTCTCCTCTC CAGATTCCTA TGACCTCCTC CTTCTCCTCT ACTTTATTGA GTATTTTACA 1560 
GAGTTCTCCT GAGAGTGCTC AAAGTGCTTT TGAGGGTTTT CCCCAGTCTC CTCTCCAGAT 1620 
TCCTGTGAGC TCCTCTTTCT CCTACACTTT ATTGAGTCTT TTCCAGAGTT CCCCTGAGAG 1680 
AACTCACAGT ACTTTTGAGG GTTTTCCCCA GTCTCCTCTC CAGATTCCTG TGAGCTCCTC 1740 
CTCCTCCTCC TCCACTTTAT TGAGTCTTTT CCAGAGTTCC CCTGAGTGTA CTCAAAGTAC 1800 
TTTTGAGGGT TTTCCCCAGT CTCCTCTCCA GATTCCTCAG AGTCCTCCTG AAGGGGAGAA 1860 
TACCCATTCT CCTCTCCAGA TTGTTCCAAG TCTTCCTGAG TGGGAGGACT CCCTGTCTCC 1920 
TCACTACTTT CCTCAGAGCC CTCCTCAGGG GGAGGACTCC CTATCTCCTC ACTACTTTCC 1980 
TCAGAGCCCT CCTCAGGGGG AGGACTCCCT GTCTCCTCAC TACTTTCCTC AGAGCCCTCA 2040 
GGGGGAGGAC TCCCTGTCTC CTCACTACTT TCCTCAGAGC CCTCCTCAGG GGGAGGACTC 2100 
CATGTCTCCT CTCTACTTTC CTCAGAGTCC TCTTCAGGGG GAGGAATTCC AGTCTTCTCT 2160 
CCAGAGCCCT GTGAGCATCT GCTCCTCCTC CACTCCATCC AGTCTTCCCC AGAGTTTCCC 2220 
TGAGAGTTCT CAGAGTCCTC CTGAGGGGCC TGTCCAGTCT CCTCTCCATA GTCCTCAGAG 2280 
CCCTCCTGAG GGGATGCACT CCCAATCTCC TCTCCAGAGT CCTGAGAGTG CTCCTGAGGG 2340 
GGAGGATTCC CTGTCTCCTC TCCAAATTCC TCAGAGTCCT CTTGAGGGAG AGGACTCCCT 2400 
GTCTTCTCTC CATTTTCCTC AGAGTCCTCC TGAGTGGGAG GACTCCCTCT CTCCTCTCCA 2460 
CTTTCCTCAG TTTCCTCCTC AGGGGGAGGA CTTCCAGTCT TCTCTCCAGA GTCCTGTGAG 2520 
TATCTGCTCC TCCTCCACTT CTTTGAGTCT TCCCCAGAGT TTCCCTGAGA GTCCTCAGAG 2580 
TCCTCCTGAG GGGCCTGCTC AGTCTCCTCT CCAGAGACCT GTCAGCTCCT TCTTCTCCTA 2640 
CACTTTAGCG AGTCTTCTCC AAAGTTCCCA TGAGAGTCCT CAGAGTCCTC CTGAGGGGCC 2700 
TGCCCAGTCT CCTCTCCAGA GTCCTGTGAG CTCCTTCCCC TCCTCCACTT CATCGAGTCT 2760 
TTCCCAGAGT TCTCCTGTGA GCTCCTTCCC CTCCTCCACT TCATCGAGTC TTTCCAAGAG 2820 
TTCCCCTGAG AGTCCTCTCC AGAGTCCTGT GATCTCCTTC TCCTCCTCCA CTTCATTGAG 2880 
CCCATTCAGT GAAGAGTCCA GCAGCCCAGT AGATGAATAT ACAAGTTCCT CAGACACCTT 2940 
GCTAGAGAGT GATTCCTTGA CAGACAGCGA GTCCTTGATA GAGAGCGAGC CCTTGTTCAC 3000 
TTATACACTG GATGAAAAGG TGGACGAGTT GGCGCGGTTT CTTCTCCTCA AATATCAAGT 3060 
GAAGCAGCCT ATCACAAAGG CAGAGATGCT GACGAATGTC ATCAGCAGGT ACACGGGCTA 3120 
CTTTCCTGTG ATCTTCAGGA AAGCCCGTGA GTTCATAGAG ATACTTTTTG GCATTTCCCT 3180 
GAGAGAAGTG GACCCTGATG ACTCCTATGT CTTTGTAAAC ACATTAGACC TCACCTCTGA 3240 
GGGGTGTCTG AGTGATGAGC AGGGCATGTC CCAGAACCGC CTCCTGATTC TTATTCTGAG 3300 
TATCATCTTC ATAAAGGGCA CCTATGCCTC TGAGGAGGTC ATCTGGGATG TGCTGAGTGG 3360 
AATAGGGGTG CGTGCTGGGA GGGAGCACTT TGCCTTTGGG GAGCCCAGGG AGCTCCTCAC 3420 
TAAAGTTTGG GTGCAGGAAC ATTACCTAGA GTACCGGGAG GTGCCCAACT CTTCTCCTCC 3480 
TCGTTACGAA TTCCTGTGGG GTCCAAGAGC TCATTCAGAA GTCATTAAGA GGAAAGTAGT 3540 
AGAGTTTTTG GCCATGCTAA AGAATACCGT CCCTATTACC TTTCCATCCT CTTACAAGGA 3600 
TGCTTTGAAA GATGTGGAAG AGAGAGCCCA GGCCATAATT GACACCACAG ATGATTCGAC 3660 
TGCCACAGAA AGTGCAAGCT CCAGTGTCAT GTCCCCCAGC TTCTCTTCTG AGTGAAGTCT 3720 
AGGGCAGATT CTTCCCTCTG AGTTTGAAGG GGGCAGTCGA GTTTCTACGT GGTGGAGGGC 3780 
CTGGTTGAGG CTGGAGAGAA CACAGTGCTA TTTGCATTTC TGTTCCATAT GGGTAGTTAT 3840 
GGGGTTTACC TGTTTTACTT TTGGGTATTT TTCAAATGCT TTTCCTATTA ATAACAGGTT 3900 
TAAATAGCTT CAGAATCCTA GTTTATGCAC ATGAGTCGCA CATGTATTGC TGTTTTTCTG 3960 
GTTTAAGAGT AACAGTTTGA TATTTTGTAA AAACAAAAAC ACACCCAAAC ACACCACATT 4020 
GGGAAAACCT TCTGCCTCAT TTTGTGATGT GTCACAGGTT AATGTGGTGT TACTGTAGGA 4080 
ATTTTCTTGA AACTGTGAAG GAACTCTGCA GTTAAATAGT GGAATAAAGT AAAGGATTGT 4140 
TAATGTTTGC ATTTCCTCAG GTCCTTTAGT CTGTTGTTCT TGAAAACTAA AGATACATAC 4200 
CTGGTTTGCT TGGCTTACGT AAGAAAGTCG AAGAAAGTAA ACTGTAATAA ATAAAAGTGT 4260 
CAGTG 4265 

<210> 2 

<211> 1142 

<212> PRT 

<213> Homo sapiens 

<220> 

<400> 2 
<210> 3 

<211> 7 

<212> PRT 

<213> Homo sapiens 

<220> 

<400> 3 

Pro Gln Ser Pro Leu Gln Ile 
1               5 

<210> 4 

<211> 4159 

<212> DNA 

<213> Homo sapiens 

<220> 

<400> 4 

GGTGGATGCG TTTGGGTTGT AGCTAGGCTT TTTCTTTTCT TTCTCTTTTA AAACACATCT 60 
AGACAAGGAA AAAACAAGCC TCGGATCTGA TTTTTCACTC CTCGTTCTTG TGCTTGGTTC 120 
TTACTGTGTT TGTGTATTTT AAAGGCGAGA AGACGAGGGG AACAAAACCA GCTGGATCCA 180 
TCCATCACCG TGGGTGGTTT TAATTTTTCG TTTTTTCTCG TTATTTTTTT TTAAACAACC 240 
ACTCTTCACA ATGAACAAAC TGTATATCGG AAACCTCAGC GAGAACGCCG CCCCCTCGGA 300 
CCTAGAAAGT ATCTTCAAGG ACGCCAAGAT CCCGGTGTCG GGACCCTTCC TGGTGAAGAC 360 
TGGCTACGCG TTCGTGGACT GCCCGGACGA GAGCTGGGCC CTCAAGGCCA TCGAGGCGCT 420 
TTCAGGTAAA ATAGAACTGC ACGGGAAACC CATAGAAGTT GAGCACTCGG TCCCAAAAAG 480 
GCAAAGGATT CGGAAACTTC AGATACGAAA TATCCCGCCT CATTTACAGT GGGAGGTGCT 540 
GGATAGTTTA CTAGTCCAGT ATGGAGTGGT GGAGAGCTGT GAGCAAGTGA ACACTGACTC 600 
GGAAACTGCA GTTGTAAATG TAACCTATTC CAGTAAGGAC CAAGCTAGAC AAGCACTAGA 660 
CAAACTGAAT GGATTTCAGT TAGAGAATTT CACCTTGAAA GTAGCCTATA TCCCTGATGA 720 
AATGGCCGCC CAGCAAAACC CCTTGCAGCA GCCCCGAGGT CGCCGGGGGC TTGGGCAGAG 780 
GGGCTCCTCA AGGCAGGGGT CTCCAGGATC CGTATCCAAG CAGAAACCAT GTGATTTGCC 840 
TCTGCGCCTG CTGGTTCCCA CCCAATTTGT TGGAGCCATC ATAGGAAAAG AAGGTGCCAC 900 
CATTCGGAAC ATCACCAAAC AGACCCAGTC TAAAATCGAT GTCCACCGTA AAGAAAATGC 960 
GGGGGCTGCT GAGAAGTCGA TTACTATCCT CTCTACTCCT GAAGGCACCT CTGCGGCTTG 1020 
TAAGTCTATT CTGGAGATTA TGCATAAGGA AGCTCAAGAT ATAAAATTCA CAGAAGAGAT 1080 
CCCCTTGAAG ATTTTAGCTC ATAATAACTT TGTTGGACGT CTTATTGGTA AAGAAGGAAG 1140 
AAATCTTAAA AAAATTGAGC AAGACACAGA CACTAAAATC ACGATATCTC CATTGCAGGA 1200 
ATTGACGCTG TATAATCCAG AACGCACTAT TACAGTTAAA GGCAATGTTG AGACATGTGC 1260 
CAAAGCTGAG GAGGAGATCA TGAAGAAAAT CAGGGAGTCT TATGAAAATG ATATTGCTTC 1320 
TATGAATCTT CAAGCACATT TAATTCCTGG ATTAAATCTG AACGCCTTGG GTCTGTTCCC 1380 
ACCCACTTCA GGGATGCCAC CTCCCACCTC AGGGCCCCCT TCAGCCATGA CTCCTCCCTA 1440 
CCCGCAGTTT GAGCAATCAG AAACGGAGAC TGTTCATCAG TTTATCCCAG CTCTATCAGT 1500 
CGGTGCCATC ATCGGCAAGC AGGGCCAGCA CATCAAGCAG CTTTCTCGCT TTGCTGGAGC 1560 
TTCAATTAAG ATTGCTCCAG CGGAAGCACC AGATGCTAAA GTGAGGATGG TGATTATCAC 1620 
TGGACCACCA GAGGCTCAGT TCAAGGCTCA GGGAAGAATT TATGGAAAAA TTAAAGAAGA 1680 
AAACTTTGTT AGTCCTAAAG AAGAGGTGAA ACTTGAAGCT CATATCAGAG TGCCATCCTT 1740 
TGCTGCTGGC AGAGTTATTG GAAAAGGAGG CAAAACGGTG AATGAACTTC AGAATTTGTC 1800 
AAGTGCAGAA GTTGTTGTCC CTCGTGACCA GACACCTGAT GAGAATGACC AAGTGGTTGT 1860 
CAAAATAACT GGTCACTTCT ATGCTTGCCA GGTTGCCCAG AGAAAAATTC AGGAAATTCT 1920 
GACTCAGGTA AAGCAGCACC AACAACAGAA GGCTCTGCAA AGTGGACCAC CTCAGTCAAG 1980 
ACGGAAGTAA AGGCTCAGGA AACAGCCCAC CACAGAGGCA GATGCCAAAC CAAAGACAGA 2040 
TTGCTTAACC AACAGATGGG CGCTGACCCC CTATCCAGAA TCACATGCAC AAGTTTTTAC 2100 
CTAGCCAGTT GTTTCTGAGG ACCAGGCAAC TTTTGAACTC CTGTCTCTGT GAGAATGTAT 2160 
ACTTTATGCT CTCTGAAATG TATGACACCC AGCTTTAAAA CAAACAAACA AACAAACAAA 2220 
AAAAGGGTGG GGGAGGGAGG GAAAGAGAAG AGCTCTGCAC TTCCCTTTGT TGTAGTCTCA 2280 
CAGTATAACA GATATTCTAA TTCTTCTTAA TATTCCCCCA TAATGCCAGA AATTGGCTTA 2340 
ATGATGCTTT CACTAAATTC ATCAAATAGA TTGCTCCTAA ATCCAATTGT TAAAATTGGA 2400 
TCAGAATAAT TATCACAGGA ACTTAAATGT TAAGCCATTA GCATAGAAAA ACTGTTCTCA 2460 
GTTTTATTTT TACCTAACAC TAACATGAGT AACCTAAGGG AAGTGCTGAA TGGTGTTGGC 2520 
AGGGGTATTA AACGTGCATT TTTACTCAAC TACCTCAGGT ATTCAGTAAT ACAATGAAAA 2580 
GCAAAATTGT TCCTTTTTTT TGAAAATTTT ATATACTTTA TAATGATAGA AGTCCAACCG 2640 
TTTTTTAAAA AATAAATTTA AAATTTAACA GCAATCAGCT AACAGGCAAA TTAAGATTTT 2700 
TACTTCTGGC TGGTGACAGT AAAGCTGGAA AATTAATTTC AGGGTTTTTT GAGGCTTTTG 2760 
ACACAGTTAT TAGTTAAATC AAATGTTCAA AAATACGGAG CAGTGCCTAG TATCTGGAGA 2820 
GCAGCACTAC CATTTATTCT TTCATTTATA GTTGGGAAAG TTTTTGACGG TACTAACAAA 2880 
GTGGTCGCAG GAGATTTTGG AACGGCTGGT TTAAATGGCT TCAGGAGACT TCAGTTTTTT 2940 
GTTTAGCTAC ATGATTGAAT GCATAATAAA TGCTTTGTGC TTCTGACTAT CAATACCTAA 3000 
AGAAAGTGCA TCAGTGAAGA GATGCAAGAC TTTCAACTGA CTGGCAAAAA GCAAGCTTTA 3060 
GCTTGTCTTA TAGGATGCTT AGTTTGCCAC TACACTTCAG ACCAATGGGA CAGTCATAGA 3120 
TGGTGTGACA GTGTTTAAAC GCAACAAAAG GCTACATTTC CATGGGGCCA GCACTGTCAT 3180 
GAGCCTCACT AAGCTATTTT GAAGATTTTT AAGCACTGAT AAATTAAAAA AAAAAAAAAA 3240 
AAATTAGACT CCACCTTAAG TAGTAAAGTA TAACAGGATT TCTGTATACT GTGCAATCAG 3300 
TTCTTTGAAA AAAAAGTCAA AAGATAGAGA ATACAAGAAA AGTTTTNGGG ATATAATTTG 3360 
AATGACTGTG AAAACATATG ACCTTTGATA ACGAACTCAT TTGCTCACTC CTTGACAGCA 3420 
AAGCCCAGTA CGTACAATTG TGTTGGGTGT GGGTGGTCTC CAAGGCCACG CTGCTCTCTG 3480 
AATTGATTTT TTGAGTTTTG GNTTGNAAGA TGATCACAGN CATGTTACAC TGATCTTNAA 3540 
GGACATATNT TATAACCCTT TAAAAAAAAA ATCCCCTGCC TCATTCTTAT TTCGAGATGA 3600 
ATTTCGATAC AGACTAGATG TCTTTCTGAA GATCAATTAG ACATTNTGAA AATGATTTAA 3660 
AGTGTTTTCC TTAATGTTCT CTGAAAACAA GTTTCTTTTG TAGTTTTAAC CAAAAAAGTG 3720 
CCCTTTTTGT CACTGGTTTC TCCTAGCATT CATGATTTTT TTTTCACACA ATGAATTAAA 3780 
ATTGCTAAAA TCATGGACTG GCTTTCTGGT TGGATTTCAG GTAAGATGTG TTTAAGGCCA 3840 
GAGCTTTTCT CAGTATTTGA TTTTTTTCCC CAATATTTGA TTTTTTAAAA ATATACACAT 3900 
AGGAGCTGCA TTTAAAACCT GCTGGTTTAA ATTCTGTCAN ATTTCACTTC TAGCCTTTTA 3960 
GTATGGCNAA TCANAATTTA CTTTTACTTA AGCATTTGTA ATTTGGAGTA TCTGGTACTA 4020 
GCTAAGAAAT AATTCNATAA TTGAGTTTTG TACTCNCCAA ANATGGGTCA TTCCTCATGN 4080 
ATAATGTNCC CCCAATGCAG CTTCATTTTC CAGANACCTT GACGCAGGAT AAATTTTTTC 4140 
ATCATTTAGG TCCCCAAAA 4159 



<210> 5 
<211> 1708 
<212> DNA 

<213> Homo sapiens 

<220> 

<400> 5 

AGGGACGCTG CCGCACCGCC CCAGTTTACC CCGGGGAGCC ATCATGAAGC TGAATGGCCA 60 

CCAGTTGGAG AACCATGCCC TGAAGGTCTC CTACATCCCC GATGAGCAGA TAGCACAGGG 120 

ACCTGAGAAT GGGCGCCGAG GGGGCTTTGG CTCTCGGGGT CAGCCCCGCC AGGGCTCACC 180 

TGTGGCAGCG GGGGCCCCAG CCAAGCAGCA GCAAGTGGAC ATCCCCCTTC GGCTCCTGGT 240 

GCCCACCCAG TATGTGGGTG CCATTATTGG CAAGGAGGGG GCCACCATCC GCAACATCAC 300 

AAAACAGACC CAGTCCAAGA TAGACGTGCA TAGGAAGGAG AACGCAGGTG CAGCTGAAAA 360 

AGCCATCAGT GTGCACTCCA CCCCTGAGGG CTGCTCCTCC GCTTGTAAGA TGATCTTGGA 420 

GATTATGCAT AAAGAGGCTA AGGACACCAA AACGGCTGAC GAGGTTCCCC TGAAGATCCT 480 

GGCCCATAAT AACTTTGTAG GGCGTCTCAT TGGCAAGGAA GGACGGAACC TGAAGAAGGT 540 

AGAGCAAGAT ACCGAGACAA AAATCACCAT CTCCTCGTTG CAAGACCTTA CCCTTTACAA 600 

CCCTGAGAGG ACCATCACTG TGAAGGGGGC CATCGAGAAT TGTTGCAGGG CCGAGCAGGA 660 

AATAATGAAG AAAGTTCGGG AGGCCTATGA GAATGATGTG GCTGCCATGA GCTCTCACCT 720 

GATCCCTGGC CTGAACCTGG CTGCTGTAGG TCTTTTCCCA GCTTCATCCA GCGCAGTCCC 780 

GCCGCCTCCC AGCAGCGTTA CTGGGGCTGC TCCCTATAGC TCCTTTATGC AGGCTCCCGA 840 

GCAGGAGATG GTGCAGGTGT TTATCCCCGC CCAGGCAGTG GGCGCCATCA TCGGCAAGAA 900 

GGGGCAGCAC ATCAAACAGC TCTCCCGGTT TGCCAGCGCC TCCATCAAGA TTGCACCACC 960 

CGAAACACCT GACTCCAAAG TTCGTATGGT TATCATCACT GGACCGCCAG AGGCCCAATT 1020 

CAAGGCTCAG GGAAGAATCT ATGGCAAACT CAAGGAGGAG AACTTCTTTG GTCCCAAGGA 1080 

GGAAGTGAAG CTGGAGACCC ACATACGTGT GCCAGCATCA GCAGCTGGCC GGGTCATTGG 1140 

CAAAGGTGGA AAAACGGTGA ACGAGTTGCA GAATTTGACG GCAGCTGAGG TGGTAGTACC 1200 

AAGAGACCAG ACCCCTGATG AGAACGACCA GGTCATCGTG AAAATCATCG GACATTTCTA 1260 

TGCCAGTCAG ATGGCTCAAC GGAAGATCCG AGACATCCTG GCCCAGGTTA AGCAGCAGCA 1320 

TCAGAAGGGA CAGAGTAACC AGGCCCAGGC ACGGAGGAAG TGACCAGCCC CTCCCTGTCC 1380 

CTTNGAGTCC AGGACAACAA CGGGCAGAAA TCGAGAGTGT GCTCTCCCCG GCAGGCCTGA 1440 

GAATGAGTGG GAATCCGGGA CACNTGGGCC GGGCTGTAGA TCAGGTTTGC CCACTTGATT 1500 

GAGAAAGATG TTCCAGTGAG GAACCCTGAT CTNTCAGCCC CAAACACCCA CCCAATTGGC 1560 

CCAACACTGT NTGCCCCTCG GGGTGTCAGA AATTNTAGCG CAAGGCACTT TTAAACGTGG 1620 

ATTGTTTAAA GAAGCTCTCC AGGCCCCACC AAGAGGGTGG ATCACACCTC AGTGGGAAGA 1680 

AAAATAAAAT TTCCTTCAGG TTTTAAAA 1708 
<210> 6 
<211> 3412 
<212> DNA 

<213> Homo sapiens 

<220> 

<400> 6 

GGCAGCGGAG GAGGCGAGGA GCGCCGGGTA 
AAGAGACGGA TGATGAACAA GCTTTACATC 
GACCTCCGGC AGCTCTTTGG GGACAGGAAG 
TCCGGCTACG CCTTCGTGGA CTACCCCGAC 
CTCTCGGGTA AAGTGGAATT GCATGGGAAA 
AAGCTAAGGA GCAGGAAAAT TCAGATTCGA 
TTGGATGGAC TTTTGGCTCA ATATGGGACA 
ACAGAAACCG CCGTTGTCAA CGTCACATAT 
GAGAAGCTAA GCGGGCATCA GTTTGAGAAC 
GAAGAGGTGA GCTCCCCTTC GCCCCCTCAG 
GAGCAAGGCC ACGCCCCTGG GGGCACTTCT 
ATCCTGGTCC CCACCCAGTT TGTTGGTGCC 
AACATCACTA AGCAGACCCA GTCCCGGGTA 
GCAGAGAAGC CTGTCACCAT CCATGCCACC 
ATTCTTGAAA TCATGCAGAA AGAGGCAGAT 
AAAATCTTGG CACACAATGG CTTGGTTGGA 
AAGAAAATTG AACATGAAAC AGGGACCAAG 
ATATACAACC CGGAAAGAAC CATCACTGTG 
GAGATAGAGA TTATGAAGAA GCTGCGTGAG 
CAACAAGCCA ATCTGATCCC AGGGTTGAAC 
CTGTCCGTGC TATCTCCACC AGCAGGGCCC 
CCCTTCACTA CCCACTCCGG ATACTTCTCC 
TTCCCGCATC ATCACTCTTA TCCAGAGCAG 
GCTGTGGGCG CCATCATCGG GAAGAAGGGG 
GGAGCCTCTA TCAAGATTGC CCCTGCGGAA 
ATCACCGGGC CACCGGAAGC CCAGTTCAAG 
GAGGAAAACT TCTTTAACCC CAAAGAAGAA 
TCTTCCACAG CTGGCCGGGT GATTGGCAAA 
TTAACCAGTG CAGAAGTCAT CGTGCCTCGT 
ATCGTCAGAA TTATCGGGCA CTTCTTTGCT 
ATTGTACAAC AGGTGAAGCA GCAGGAGCAG 
AGCAAGTGAG GCTCCCACAG GCACCAGCAA 
CTGACAGAAT GAGACCAAAC GCAGCCAGCC 
GAATGAGAAG TCTGCGGAGG CGGCCAGGGA 
CGAGGAGGGG CGGGGAAGGT CAGCCAGGTT 
CCCCAGGGCT TCTGCAGGCT TCAGCCATCC 
CTCCCACGAC GCTATCCCTT TTAGTTGAAC 
AAAATGCACA CCCTTTTTCT GTGGCAAATC 
GGGAAGATGT TAAGATATGT GGCCTGTGGG 
TTTAGAAATA ATATATCAAA TAACTCAACT 
TTTTTCTTTT TAAAGAGAAA GCAGGCTTTT 
GTCTCACGGT GTAGAGAGGA GCTTTGAGGC 
CTCGTCGGAA GGACACTCAC GGCAGTTCTG 
CCGTCTCCTT GAAGAGGAAA CTCTGTCACT 
TCTCTTTGCT TCACAGGTTT TAAACTGGTT 
CTCTCTGTTT ATCTCTCCCC TCCCTCCCCT 
TTTCCTCATC CCTCCATCTC AATCCCGTAT 
GTGCTCTGAG TAT C AC AT C A CACAAAAGGA 
CTTACACTTG GTTACTCAAA AGAACAAGAG 
AGGAAAACAG GAACCCACCA AACCAACCAA 
AAAGAATGTA TTTTGTCTTT TTGCATTTTG 
ATTCCTTTCT TTAAAAAAAA AAATGTGGAG 
CAGGGCGTTA AATTCACAGA TTTTTTTAAC 
GTGTTTTTAC CTCAGCACCT TGCTCTTGTG 
TTGGAGCATT TTTTTATTTT TTTAATAAAA 
GCCAGCCTGG AGAAGGTGAC AGTCCAAGTG 
AGCCAAGAAC CNATATGGCC TTCTTTTGGA 



CCGGGCCGGG GGAGCCGCGG GCTCTCGGGG 60 
GGGAACCTGA GCCCCGCCGT CACCGCCGAC 120 
CTGCCCCTGG CGGGACAGGT CCTGCTGAAG 180 
CAGAACTGGG CCATCCGCGC CATCGAGACC 240 
ATCATGGAAG TTGATTACTC AGTCTCTAAA 300 
AACATCCCTC CTCACCTGCA GTGGGAGGTG 360 
GTGGAGAATG TGGAACAAGT CAACACAGAC 420 
GCAACAAGAG AAGAAGCAAA AATAGCCATG 480 
TACTCCTTCA AGATTTCCTA CATCCCGGAT 540 
CGAGCCCAGC GTGGGGACCA CT.CTTCCCGG 600 
CAGGCCAGAC AGATTGATTT CCCGCTGCGG 660 
ATCATCGGAA AGGAGGGCTT GACCATAAAG 720 
GATATCCATA GAAAAGAGAA CTCTGGAGCT 780 
CCAGAGGGGA CTTCTGAAGC ATGCCGCATG 840 
GAGACCAAAC TAGCCGAAGA GATTCCTCTG 900 
AGACTGATTG GAAAAGAAGG CAGAAATTTG 960 
ATAACAATCT CATCTTTGCA GGATTTGAGC 1020 
AAGGGCACAG TTGAGGCCTG TGCCAGTGCT 1080 
GCCTTTGAAA ATGATATGCT GGCTGTTAAC 1140 
CTCAGCGCAC TTGGCATCTT TTCAACAGGA 1200 
CGCGGAGCTC CCCCCGCTGC CCCCTACCAC 1260 
AGCCTGTACC CCCATCACCA GTTTGGCCCG 1320 
GAGATTGTGA ATCTCTTCAT CCCAACCCAG 1380 
GCACACATCA AACAGCTGGC GAGATTCGCC 1440 
GGCCCAGACG TCAGCGAAAG GATGGTCATC 1500 
GCCCAGGGAC GGATCTTTGG GAAACTGAAA 1560 
GTGAAGCTGG AAGCGCATAT CAGAGTGCCC 1620 
GGTGGCAAGA CCGTGAACGA ACTGCAGAAC 1680 
GACCAAACGC CAGATGAAAA TGAGGAAGTG 1740 
AGCCAGACTG CACAGCGCAA GATCAGGGAA 1800 
AAATACCCTC AGGGAGTCGC CTCACAGCGC 1860 
AACAACGGAT GAATGTAGCC CTTCCAACAC 1920 
AGATCGGGAG CAAACCAAAG ACCATCTGAG 1980 
CTCTGCCGAG GCCCTGAGAA CCCCAGGGGC 2040 
TGCCAGAACC ACCGAGCCCC GCCTCCCGCC 2100 
ACTTCACCAT CCACTCGGAT CTCTCCTGAA 2160 
TAACATAGGT GAACGTGTTC AAAGCCAAGC 2220 
GTCTCTGTAC ATGTGTGTAC ATATTAGAAA 2280 
TTACACAGGG TGCCTGCAGC GGTAATATAT 2340 
AACTCCAATT TTTAATCAAT TATTAATTTT 2400 
CTAGACTTTA AAGAATAAAG TCTTTGGGAG 2460 
CACCCGCACA AAATTCACCC AGAGGGAAAT 2520 
GATCACCTGT GTATGTCAAC AGAAGGGATA 2580 
CCTCATGCCT GTCTAGCTCA TACACCCATT 2640 
TTTTGCATAC TGCTATATAA TTCTCTGTCT 2700 
CCCCTTCTTC TCCATCTCCA TTCTTTTGAA 2760 
CTACGCACCC CCCCCCCCCC AGGCAAAGCA 2820 
ACAAAAGCGA AACACACAAA CCAGCCTCAA 2880 
TCAATGGTAC TTGTCCTAGC GTTTTGGAAG 2940 
TCAACCAAAC AAAGAAAAAA TTCCACAATG 3000 
GTGTATAAGC CATCAATATT CAGCAAAATG 3060 
GAAAGTAGAA ATTTACCAAG GTTGTTGGCC 3120 
GAGAAAAACA CACAGAAGAA GCTACCTCAG 3180 
TTTCCCTTAG AGATTTTGTA AAGCTGATAG 3240 
ATGAGTTGGA AAAAAAATAA GATATCAACT 3300 
TGCAACAGCT GTTCTGAATT GTCTTCCGCT 3360 
CAAACCTTGA AAATGTTTAT TT 3412 
<210> 7

<211> 1946 

<212> DNA 

<213> Homo sapiens 

<220> 

<400> 7 

GCTGTAGCGG AGGGGCTGGG GGGCTGCTCT GTCCCCTTCC TTGCGCGCTG CGGCCTCAGC 60
CCACCCAGAG GCCGGGGTGG GAGGGCGAGT GCTCAGCTTC CCGGGTTAGG AGCCGGAAAA 120
TTCAAATCCG AAATATTCCA CCCCAGCTCC GATGGGAAGT ACTGGACAGC CTGCTGGCTC 180
AGTATGGTAC AGTAGAGAAC TGTGAGCAAG TGAACACCGA GAGTGAGACG GCAGTGGTGA 240
ATGTCACCTA TTCCAACCGG GAGCAGACCA GGCAAGCCAT CATGAAGCTG AATGGCCACC 300
AGTTGGAGAA CCATGCCCTG AAGGTCTCCT ACATCCCCGA TGAGCAGATA GCACAGGGAC 360
CTGAGAATGG GCGCCGAGGG GGCTTTGGCT CTCGGGGTCA GCCCCGCCAG GGCTCACCTG 420
TGGCAGCGGG GGCCCCAGCC AAGCAGCAGC AAGTGGACAT CCCCCTTCGG CTCCTGGTGC 480
CCACCCAGTA TGTGGGTGCC ATTATTGGCA AGGAGGGGGC CACCATCCGC AACATCACAA 540
AACAGACCCA GTCCAAGATA GACGTGCATA GGAAGGAGAA CGCAGGTGCA GCTGAAAAAG 600
CCATCAGTGT GCACTCCACC CCTGAGGGCT GCTCCTCCGC TTGTAAGATG ATCTTGGAGA 660
TTATGCATAA AGAGGCTAAG GACACCAAAA CGGCTGACGA GGTTCCCCTG AAGATCCTGG 720
CCCATAATAA CTTTGTAGGG CGTCTCATTG GCAAGGAAGG ACGGAACCTG AAGAAGGTAG 780
AGCAAGATAC CGAGACAAAA ATCACCATCT CCTCGTTGCA AGACCTTACC CTTTACAACC 840
CTGAGAGGAC CATCACTGTG AAGGGGGCCA TCGAGAATTG TTGCAGGGCC GAGCAGGAAA 900
TAATGAAGAA AGTTCGGGAG GCCTATGAGA ATGATGTGGC TGCCATGAGC TCTCACCTGA 960
TCCCTGGCCT GAACCTGGCT GCTGTAGGTC TTTTCCCAGC TTCATCCAGC GCAGTCCCGC 1020
CGCCTCCCAG CAGCGTTACT GGGGCTGCTC CCTATAGCTC CTTTATGCAG GCTCCCGAGC 1080
AGGAGATGGT GCAGGTGTTT ATCCCCGCCC AGGCAGTGGG CGCCATCATC GGCAAGAAGG 1140
GGCAGCACAT CAAACAGCTC TCCCGGTTTG CCAGCGCCTC CATCAAGATT GCACCACCCG 1200
AAACACCTGA CTCCAAAGTT CGTATGGTTA TCATCACTGG ACCGCCAGAG GCCCAATTCA 1260
AGGCTCAGGG AAGAATCTAT GGCAAACTCA AGGAGGAGAA CTTCTTTGGT CCCAAGGAGG 1320
AAGTGAAGCT GGAGACCCAC ATACGTGTGC CAGCATCAGC AGCTGGCCGG GTCATTGGCA 1380
AAGGTGGAAA AACGGTGAAC GAGTTGCAGA ATTTGACGGC AGCTGAGGTG GTAGTACCAA 1440
GAGACCAGAC CCCTGATGAG AACGACCAGG TCATCGTGAA AATCATCGGA CATTTCTATG 1500
CCAGTCAGAT GGCTCAACGG AAGATCCGAG ACATCCTGGC CCAGGTTAAG CAGCAGCATC 1560
AGAAGGGACA GAGTAACCAG GCCCAGGCAC GGAGGAAGTG ACCAGCCCCT CCCTGTCCCT 1620
TNGAGTCCAG GACAACAACG GGCAGAAATC GAGAGTGTGC TCTCCCCGGC AGGCCTGAGA 1680
ATGAGTGGGA ATCCGGGACA CNTGGGCCGG GCTGTAGATC AGGTTTGCCC ACTTGATTGA 1740
GAAAGATGTT CCAGTGAGGA ACCCTGATCT NTCAGCCCCA AACACCCACC CAATTGGCCC 1800
AACACTGTNT GCCCCTCGGG GTGTCAGAAA TTNTAGCGCA AGGCACTTTT AAACGTGGAT 1860
TGTTTAAAGA AGCTCTCCAG GCCCCACCAA GAGGGTGGAT CACACCTCAG TGGGAAGAAA 1920
AATAAAATTT CCTTCAGGTT TTAAAA 1946



<210> 8 

<211> 3283 

<212> DNA 

<213> Homo sapiens 

<220> 

<400> 8 

GGCAGCGGAG GAGGCGAGGA GCGCCGGGTA CCGGGCCGGG GGAGCCGCGG GCTCTCGGGG 60
AAGAGACGGA TGATGAACAA GCTTTACATC GGGAACCTGA GCCCCGCCGT CACCGCCGAC 120
GACCTCCGGC AGCTCTTTGG GGACAGGAAG CTGCCCCTGG CGGGACAGGT CCTGCTGAAG 180
TCCGGCTACG CCTTCGTGGA CTACCCCGAC CAGAACTGGG CCATCCGCGC CATCGAGACC 240
CTCTCGGGTA AAGTGGAATT GCATGGGAAA ATCATGGAAG TTGATTACTC AGTCTCTAAA 300
AAGCTAAGGA GCAGGAAAAT TCAGATTCGA AACATCCCTC CTCACCTGCA GTGGGAGGTG 360
TTGGATGGAC TTTTGGCTCA ATATGGGACA GTGGAGAATG TGGAACAAGT CAACACAGAC 420
ACAGAAACCG CCGTTGTCAA CGTCACATAT GCAACAAGAG AAGAAGCAAA AATAGCCATG 480
GAGAAGCTAA GCGGGCATCA GTTTGAGAAC TACTCCTTCA AGATTTCCTA CATCCCGGAT 540
GAAGAGGTGA GCTCCCCTTC GCCCCCTCAG CGAGCCCAGC GTGGGGACCA CTCTTCCCGG 600
GAGCAAGGCC ACGCCCCTGG GGGCACTTCT CAGGCCAGAC AGATTGATTT CCCGCTGCGG 660
ATCCTGGTCC CCACCCAGTT TGTTGGTGCC ATCATCGGAA AGGAGGGCTT GACCATAAAG 720
AACATCACTA AGCAGACCCA GTCCCGGGTA GATATCCATA GAAAAGAGAA CTCTGGAGCT 780
GCAGAGAAGC CTGTCACCAT CCATGCCACC CCAGAGGGGA CTTCTGAAGC ATGCCGCATG 840
ATTCTTGAAA TCATGCAGAA AGAGGCAGAT GAGACCAAAC TAGCCGAAGA GATTCCTCTG 900
AAAATCTTGG CACACAATGG CTTGGTTGGA AGACTGATTG GAAAAGAAGG CAGAAATTTG 960
AAGAAAATTG AACATGAAAC AGGGACCAAG ATAACAATCT CATCTTTGCA GGATTTGAGC 1020
ATATACAACC CGGAAAGAAC CATCACTGTG AAGGGCACAG TTGAGGCCTG TGCCAGTGCT 1080
GAGATAGAGA TTATGAAGAA GCTGCGTGAG GCCTTTGAAA ATGATATGCT GGCTGTTAAC 1140
ACCCACTCCG GATACTTCTC CAGCCTGTAC CCCCATCACC AGTTTGGCCC GTTCCCGCAT 1200
CATCACTCTT ATCCAGAGCA GGAGATTGTG AATCTCTTCA TCCCAACCCA GGCTGTGGGC 1260
GCCATCATCG GGAAGAAGGG GGCACACATC AAACAGCTGG CGAGATTCGC CGGAGCCTCT 1320
ATCAAGATTG CCCCTGCGGA AGGCCCAGAC GTCAGCGAAA GGATGGTCAT CATCACCGGG 1380
CCACCGGAAG CCCAGTTCAA GGCCCAGGGA CGGATCTTTG GGAAACTGAA AGAGGAAAAC 1440
TTCTTTAACC CCAAAGAAGA AGTGAAGCTG GAAGCGCATA TCAGAGTGCC CTCTTCCACA 1500
GCTGGCCGGG TGATTGGCAA AGGTGGCAAG ACCGTGAACG AACTGCAGAA CTTAACCAGT 1560
GCAGAAGTCA TCGTGCCTCG TGACCAAACG CCAGATGAAA ATGAGGAAGT GATCGTCAGA 1620
ATTATCGGGC ACTTCTTTGC TAGCCAGACT GCACAGCGCA AGATCAGGGA AATTGTACAA 1680
CAGGTGAAGC AGCAGGAGCA GAAATACCCT CAGGGAGTCG CCTCACAGCG CAGCAAGTGA 1740
GGCTCCCACA GGCACCAGCA AAACAACGGA TGAATGTAGC CCTTCCAACA CCTGACAGAA 1800
TGAGACCAAA CGCAGCCAGC CAGATCGGGA GCAAACCAAA GACCATCTGA GGAATGAGAA 1860
GTCTGCGGAG GCGGCCAGGG ACTCTGCCGA GGCCCTGAGA ACCCCAGGGG CCGAGGAGGG 1920
GCGGGGAAGG TCAGCCAGGT TTGCCAGAAC CACCGAGCCC CGCCTCCCGC CCCCCAGGGC 1980
TTCTGCAGGC TTCAGCCATC CACTTCACCA TCCACTCGGA TCTCTCCTGA ACTCCCACGA 2040
CGCTATCCCT TTTAGTTGAA CTAACATAGG TGAACGTGTT CAAAGCCAAG CAAAATGCAC 2100
ACCCTTTTTC TGTGGCAAAT CGTCTCTGTA CATGTGTGTA CATATTAGAA AGGGAAGATG 2160
TTAAGATATG TGGCCTGTGG GTTACACAGG GTGCCTGCAG CGGTAATATA TTTTAGAAAT 2220
AATATATCAA ATAACTCAAC TAACTCCAAT TTTTAATCAA TTATTAATTT TTTTTTCTTT 2280
TTAAAGAGAA AGCAGGCTTT TCTAGACTTT AAAGAATAAA GTCTTTGGGA GGTCTCACGG 2340
TGTAGAGAGG AGCTTTGAGG CCACCCGCAC AAAATTCACC CAGAGGGAAA TCTCGTCGGA 2400
AGGACACTCA CGGCAGTTCT GGATCACCTG TGTATGTCAA CAGAAGGGAT ACCGTCTCCT 2460
TGAAGAGGAA ACTCTGTCAC TCCTCATGCC TGTCTAGCTC ATACACCCAT TTCTCTTTGC 2520
TTCACAGGTT TTAAACTGGT TTTTTGCATA CTGCTATATA ATTCTCTGTC TCTCTCTGTT 2580
TATCTCTCCC CTCCCTCCCC TCCCCTTCTT CTCCATCTCC ATTCTTTTGA ATTTCCTCAT 2640
CCCTCCATCT CAATCCCGTA TCTACGCACC CCCCCCCCCC CAGGCAAAGC AGTGCTCTGA 2700
GTATCACATC ACACAAAAGG AACAAAAGCG AAACACACAA ACCAGCCTCA ACTTACACTT 2760
GGTTACTCAA AAGAACAAGA GTCAATGGTA CTTGTCCTAG CGTTTTGGAA GAGGAAAACA 2820
GGAACCCACC AAACCAACCA ATCAACCAAA CAAAGAAAAA ATTCCACAAT GAAAGAATGT 2880
ATTTTGTCTT TTTGCATTTT GGTGTATAAG CCATCAATAT TCAGCAAAAT GATTCCTTTC 2940
TTTAAAAAAA AAAATGTGGA GGAAAGTAGA AATTTACCAA GGTTGTTGGC CCAGGGCGTT 3000
AAATTCACAG ATTTTTTTAA CGAGAAAAAC ACACAGAAGA AGCTACCTCA GGTGTTTTTA 3060
CCTCAGCACC TTGCTCTTGT GTTTCCCTTA GAGATTTTGT AAAGCTGATA GTTGGAGCAT 3120
TTTTTTATTT TTTTAATAAA AATGAGTTGG AAAAAAAATA AGATATCAAC TGCCAGCCTG 3180
GAGAAGGTGA CAGTCCAAGT GTGCAACAGC TGTTCTGAAT TGTCTTCCGC TAGCCAAGAA 3240
CCNATATGGC CTTCTTTTGG ACAAACCTTG AAAATGTTTA TTT 3283
<210> 1
<211> 4265
<212> DNA

<213> Homo sapiens

<220>

<400> 1

GTCTGAAGGR CCTGA^GCAT TTTCTG?iCGA GGATCGTCTC AGGTCAGCGG AGGGAGG&CA $0 
CTTJVTAGACC! TATCGA05TCT 'TCAACCTOCT CCAGAAftGCA GGAGTTGAAG ACCTGGGTGT 120 
GAGGGACftCA TACATC7CTRA AAGCACCACA GCAGAGGAGG CCCAGGCAGT GU<2*£GAGrC 180 
AAGGTTCCCft GMGACAASC CCCCTAGGAA GACAGGCGAC CTCTCfcGGCC CTRGAGCACC 240 
ACCTTAAGAG AAGAAGAGCT GTAAGCCGGC CTTTGTCAGA GCCATCATGG GGGACAAGGA 300 
TATGCCXACT CCTCGGATGC CGAGTCfTCT CCAGAGTTCC TCTGAGAGTC CTCAGAGU'JG 3 GO 
TQCTGAGGGG GAGGACTCCC AGTCrCCTCT- CCAGATTCCn CAGAGTTCTC CTGAG&SCGA 4 2D 
C&ACACCCTG TATCCTCTCC MSMVrcCTCA GAGTCGTTCT GA&GGGGAGC ACTCCTCGGA 480 
TCCTCTCCAG AGACCTCCTC AGGG&AAGGft CTCCCAGTCT CCTMCCAGA TTCCCCAGAG 540 
TTCTCCTGAG GGUGACGACA CCCAGTCTCD TCTCCAGAAT TCTURGAGIT CTCCTGAGGG 600 
GAAGGaCTrC CTGTCrCCTC TAGAG&TTTE TCAGAjGCCCT CCTGAGGGTG ACGATCTCCA E60 
GTCTCCTCTG CAGAATCCTG CGAGTTCCTT OrTCTCCTCT GCTTTATtGA CTATTTTCCA 720 
GAGETCCCCT ftAGriGTATTC AAAGTCCTXT TGAGGGTTTT CCCCACTCTG TTCTCCAGAT 70 0 
•I'OCTCTGAGC 6CCC5CCTCCT CCTCCACTTT AGTGA&fATT TTCCASAGTT CCCCTGAGA6 640 
TACTCAAAGT CCTTTTG&GG GTTTTCCCCft GTCTCCaCTC CAGATTCCTG TGAGCCCCTC $00 
Crt'CTCCTCC ACTTTATTGA GTATTTOCCA GflGTTCCCCT GAGAGMGTC AGAGAACTTC %0 
TCAGGCTTTT CCACAGTCTC GTCTCCAGAT TCCTCTGAtiC TGCTCCTCGT CCTCCACTTT 1020 
ACTGAGTCT? TTCCAGA^TT CCCCTGAGAG AACTCAGAG'T ACmTGAGG GrmCCCCA 10$D 
GTCTCCACTC CAGATTCCTG TGAGCCGCTC CTTCTCCTCC ACTTTATTOA GTATTTTCCA 1140 
GAGTTCCOCTP GAGAGAACTC AGAGIACTT? TGAGGGTTTT GCCCAGTCTC CTCTCCAGAT 1200 
TCCTGTGAGC OCCTCCTTCT CCTCCACm ACTG^GTATT TTCCAGAGTT OCCCTGAGAG 1260 
JWTraGAST ACTTTTGAGG GTTTTGCCCA GTCTCCTGTC CAGATTCCTG TGAGCTCCTC 1320 
CTTCTCCTCG ACrfTATTCA GTCTTTTCCA GAGTTCCCCT UAGAGAACTC AGAGTACTTT 1380 
TGAGGGrTTT CCCCAGTCTC CTCTCCA&AT TCCTCGAAGC CCCTCCTTCT CCTCGACm 1440 
ACTGAGTCZTT TTCCA&AGTT CCCCTGAGAG AACTCACAGT ACTOTGAGG GTTTTCCCCA 1500 
GTCTCCTCTC CAGATTCCTA TGACCTCHTC CTTCTCCTCT ACTMATTGA GTATITTACA 1560 
GAGTTCrCCT G^GAGTGCrc AAAGTGCTTT TGAGGGTTTT CCXXiAG'fCTC CTdCCRGAT 1620 
TCCTGTGAGC TCCTCTTTCT CCTACACTT^ ATTGAGTCTTT TTCCAGAGTT CCCCTGAGAG 168D 
AACTCACAGT AdTTTGAGG GTTTTCCCCA GTCTGCTCTC CAGATTCCTG TGAGCTCCTC 1T4D 
CTCCTCCa'CC TCCACTTTOT TGAGTCTTTT CCAGAGfTCC CCTGftGTCTA CTCAAAGTAC 1800 
TTTTOAGGGl' 'mCCCCAGT CTCCTCTC5CA GATTGCTCAG AGTCCTCJTTG AAGGGGAGAA 1860 
TACCCATTCT CUTC'l^CACA TTOTTCCAAG TCTTCCTGAG TGGGAGGftCT CCCTGTCTCC 1920 
TCACTACTTT rnTCAGAGCC CTCCTCAGGG GGAGGACTCC CTATCTCCTO AGTA^TTCC 19 AO 
TCAGAGCCCT CCTCAGGGGG AGGACTCCCT GTCTCCT^C TACTTTCCTC AGROCCCTCA 20^0 
GGG6GAGGAC TCCCTGTCTC CTCACTACTT TCCTGAGAGC CCTCCTCAGG GGGAGGACTC 2100 
OlTGTCrCCT CTCTACTTTC CTCAGAGTCC TCT^CAGGGP GAOGAATTOC AGTCTTCTCT 2160 
CCAGAGCCCJ' GTGAGCATCT GCTCCTCCTC CACTCCATCC AGTCTTCCCC AGAGTTTCCC 222U 
TGAGAGTTCT CA&AGTCCTC CTGAGGGGCC TGTCCAGTCT CCTCTCCATA GTCCTCAGAG 22 BO 
CCCTOCTGAG GGGATGCACT CCCAATCTCC TCTCCW3AOT CCXGAGAGTG CTCCtfGAGGG 2340
GGAGGftTTCC CTGTCTCCTC TCCAAAT"fCC TCA6AGTCCT CTT^GGGAG AGGaCJTCCCT 2400 
CTCTTCTCTC CATTTTCCTC AGAGTDCTCC TGAGTGGGAG GACTCCCTCT UTOCtfCTCCA 2460 

ctttcctcag r^rccrcCTC ?i&sgggagga cticcagtct tctctqcaga gtcctgtgag 2520 

TATCTPCTCC TCCTCCACTT CTTTGAGtfCT TCCCCAGAGT TTCCCTGAGA GTCCTCAGAG 25B0 
TCCTCCTGAG G&GCCTCCTC AGTCTCDl'CT CCAGAGACCT GTOGCTCCT TCFXCTCCPA 2640 
CACTTT2VSCG AGTCTTCT'CC AMGTTCCCA TEAGAGTCCT CAGAGTCCTG CT&AGGGGCC 2700 
TGCCCAGTCT CCTCICCAGA GTCCTETE&G CttCCTTCCCC TCCTCCACTT CATCGAGTC'f 2760 
TTCCCAGAGT TCTCCVGTGA CCTCCTTCCC CTCCTCCACT TCATCGAGTC TTTCCAAGAG 2620 
TTCCCCTGAG AGTCCTCTCC AGAGTCCTGT GATGTCCTTC TCCTCCTCCA CTTCATTCAG 2B&0 
OCCATTCACT G^flGAGrCOA GCAGCCCAGT AGATGAAtAT ACAAGTTCCT CAGACACCTT 2S40 
GCTAGAGAGT GAFTCCnGA CAGACAGCGA RTCCFTGATA GAGAGCGAGC CCTTGTTC^C 5000 
TCATACACTG GAFGTiAAAGG I'GGACGAGTT* . GGCGCGGTTT CTTCTCCTCA AATATCAAGT 3060 
GAAGCAGCCT ATCACAAAGG CAGAGATGCT GAC6AATGTC ATCAGC&GGT ACACCGGCTA 3120 
CTTTOCTGTG ATCTTCAGGA AAGCCCGTGA GTTCATAGAG ATACTTTTTG GCATTTCCCT 31 $0 
GAGAGAAGTG GACCCTGATG ACTCCTATGT CTTTGWWYC ACATTAGAOC TCACCTCTGA 3240 
GGGGTISTC'rG AGTGATG7^ AGGGCATGTC CCAGArtCCGC CTCCTGArtC TTATTCTGAG 3300 
TATCATCTTC ATAAAGGGCT CCTATGCCTC TGAGGftGGTC OTCTGGGATG TGCTGAGTGG 3360 
AATAGGGGTG CGTGCTGGGA GGGAGCACTT TGCCTTTGGG GAGCCCAGGG AGCfCCTCAC 3420 
TfiftAGTTTGG GTGCAGGAAC ATTACCtfAGA GTACCGGGAG GTGGCCAACT UTTClCCTOC 34 BO 
TCGTTACGAA TTCCTCTGGG GTCCAAGAGC TCATTCAGAft GTCATTAAGA GGAAAG'l'AG'J' 3540 
AGAGTTTTTG GCCATGCTAA AGAATACCGT CCCTATTACC TTTCCJVTCCT CTTACAAGGA 3600 
TGCTTTGAAA GA'l'GTGGAAG AKAGAGCCCA GGCCATAATT GAGRCCACAG AFGATTCGAC 3660 
TCCCACAGAA agtGCRAGCT CCAGTGTCAT GTCCCCCAGC TTCTCTTCTG AGTGAAGTCT 3720 
AGGCCAGATT HTTCCCTCTC AGTXTGAAGG GGGCAGTCGA CTTTCTACGT GOTGGAGGGC 3780 
CTGGTTGAGG CTGGAGAGAA CACAGTGCTA TT'l'GCATTTC TGTTCCATAT GGGTAGTTAT 3840 
GGGGTTTACC TGTTTTAC1T TTGGGTJVTTT TTCAAAfGCT TTTCCTAOTA ATMCAGGTT &&0D 
TAWlTAGCTT 1 CAGAATCCTA GmATGCAC ATGAGTCGCA CATG!TATTGC TGTTTTTCTG 3960 
GTTTAAGAGT AAGAGTTTGA 'MTTTTGrAA AAACAAAAAC ACACCDARAC ACACCACATT 4020 

gggaaaajgct tctgcctcat tttgtgjvtct gtcacaggtt aatgtggtgt oactgtagga 40DO 

ATTTTCTTGA AACTGTGAAG GAACTCTGCA GTTAAWAjGT GGAATAAAGT AAAGGATTCT 4140 
TAATGTTTGC ATTTCCTCAG GTCCTTTAGT CTGTTGTTCT TGAAAACTAA AGAPACATAC 4200 
CTGGTTTGCr TGGCTTACGT AAGAAAGTCG AAGAAAGXAA ACTCTAATAA ATAAAACTCT 4260 

cmm 4i2$t> 

<210> 2 

<211> 1142 

<212> PRT

<213> Homo sapiens

<220> 

<400> 2 
Leu Ser Pro Leu Gln Ile Pro Gln Ser Pro Leu Glu Gly Glu Asp Ser
690 695 700

Leu Ser Ser Leu His Phe Pro Gln Ser Pro Pro Glu Trp Glu Asp Ser
705 710 715 720

Leu Ser Pro Leu His Phe Pro Gln Phe Pro Pro Gln Gly Glu Asp Phe
725 730 735

Gln Ser Ser Leu Gln Ser Pro Val Ser Ile Cys Ser Ser Ser Thr Ser
740 745 750

Leu Ser Leu Pro Gln Ser Phe Pro Glu Ser Pro Gln Ser Pro Pro Glu
755 760 765

Gly Pro Ala Gln Ser Pro Leu Gln Arg Pro Val Ser Ser Phe Phe Ser
770 775 780

Tyr Thr Leu Ala Ser Leu Leu Gln Ser Ser His Glu Ser Pro Gln Ser
785 790 795 800

Pro Pro Glu Gly Pro Ala Gln Ser Pro Leu Gln Ser Pro Val Ser Ser
805 810 815

Phe Pro Ser Ser Thr Ser Ser Ser Leu Ser Gln Ser Ser Pro Val Ser
820 825 830

Ser Phe Pro Ser Ser Thr Ser Ser Ser Leu Ser Lys Ser Ser Pro Glu
835 840 845

Ser Pro Leu Gln Ser Pro Val Ile Ser Phe Ser Ser Ser Thr Ser Leu
850 855 860

Ser Pro Phe Ser Glu Glu Ser Ser Ser Pro Val Asp Glu Tyr Thr Ser
865 870 875 880

Ser Ser Asp Thr Leu Leu Glu Ser Asp Ser Leu Thr Asp Ser Glu Ser
885 890 895

Leu Ile Glu Ser Glu Pro Leu Phe Thr Tyr Thr Leu Asp Glu Lys Val
900 905 910

Asp Glu Leu Ala Arg Phe Leu Leu Leu Lys Tyr Gln Val Lys Gln Pro
915 920 925

Ile Thr Lys Ala Glu Met Leu Thr Asn Val Ile Ser Arg Tyr Thr Gly
930 935 940

Tyr Phe Pro Val Ile Phe Arg Lys Ala Arg Glu Phe Ile Glu Ile Leu
945 950 955 960

Phe Gly Ile Ser Leu Arg Glu Val Asp Pro Asp Asp Ser Tyr Val Phe
965 970 975

Val Asn Thr Leu Asp Leu Thr Ser Gly Gly Cys Leu Ser Asp Glu Gln
980 985 990

Gly Met Ser Gln Asn Arg Leu Leu Ile Leu Ile Leu Ser Ile Ile Phe
995 1000 1005

Ile Lys Gly Thr Tyr Ala Ser Glu Glu Val Ile Trp Asp Val Leu Ser
1010 1015 1020

Gly Ile Gly Val Arg Ala Gly Arg Glu His Phe Ala Phe Gly Glu Pro
1025 1030 1035 1040

Arg Glu Leu Leu Thr Lys Val Trp Val Gln Glu His Tyr Leu Glu Tyr
1045 1050 1055

Arg Glu Val Pro Asn Ser Ser Pro Pro Arg Tyr Glu Phe Leu Trp Gly
1060 1065 1070

Pro Arg Ala His Ser Glu Val Ile Lys Arg Lys Val Val Glu Phe Leu
1075 1080 1085

Ala Met Leu Lys Asn Thr Val Pro Ile Thr Phe Pro Ser Ser Tyr Lys
1090 1095 1100

Asp Ala Leu Lys Asp Val Glu Glu Arg Ala Gln Ala Ile Ile Asp Thr
1105 1110 1115 1120

Thr Asp Asp Ser Thr Ala Thr Glu Ser Ala Ser Ser Ser Val Met Ser
1125 1130 1135

Pro Ser Phe Ser Ser Glu
1140
<210> 4
<211> 4159
<212> DNA

<213> Homo sapiens

<220> 

<400> 4 

GGTGGATGCG TTTGGETTGT AGCTAGGCTT TTTCTTTTCT T'i'CtfCTTTTA AAACACATCT 60 
AGftCAAGGAA AAAACAftGOC TCGGATCTGA TTTTTCnCTC CTOtfJl'CTTG TGCTTGCTTC 120 

ttacrgtgtt tctgtotttt aaaggcgaga agacgagggg ftagaaaacca gctggatcga 100 

tccatcaccg tgggtggttt u'aatttttcc ttttttctcg ttatttt'f'lt ttaaftcaacc 240 

actcttcaca atgaacftaac tgtatatcgc aaacctcagc ga&aacgocg ccccctcgga 300 

cctagaaag'f atcttcaagg acfccoaagat cccggtotcg toanccttou tgotgaagac 360 

tggctacgc6 ttcctggact goccggacga ga-gctgggcc ctcmggcca tcgaggcgct 420 

ttgagqtaaa atagaactgc acrggaaacc catagaagtt gagcfictcgg toccaaaaag 480 

gcaaaggatt cggaaacttc acttaogaaa tatcocgcct catttacagt gggaggtgct 54 d 

ggatagtita cragfccagt atg6agtggt ggagagctgt gagcamtca acactgacl'c god 

ggaaactgca cvttgtaaatg taacctattc cagtaaggac caagctagnc aagcftctaga €60 

CAAACTGSAT ggatttcagt tagagaattt CACC2TGA&A GTAGCCTATA TCCCTGATGA. 720 

AATGGCCGCC CAGCAAAACC CCTTGCAGCA GCCCCGAGGT CGCCGGGGGH 1TGGGCAGAG ?B0 

GGGCTCCTCA AGGCAGGGG? CTCCAGGATC CGTAl'CCAAG CAGAAACCflT <?TGATTTGCC 840 

TCTGCGCCTG CTGf*TTCCCA CCCAATTTCT TGGftGCCKTC ATAGGAAAAG AAGGTGCCAC 900 

CATTCGGAAC ATCACCAAA0 AGACCCfcCpTC TAAAATOGAT GTCCACCGTA AAGAAAATGC D60 

GGGGGCTCCT GAGAAGTC&A TTACTATCCT CTCTACfCCT GAAGCCACCT CTGOGGCTrfi 1D20 

TAAGTCTATT CTGO&GATTA TGCATAAGGA AjQCICAAGAT ATAAAATTCA CWlAEAGAT 1030 

cxxjgttgaag attttagctc ataataactt tgxtggacgt cjtattggta aagaaggaag 1140 

aaatcrtaaa aaaatt3agc aagacacaga cagtaaaatc acgatatctc cattgcagga 1200 

atxgacgctg tataatcc^g aacgcactat tacwttaaa ggcaatgttc agac7\t(3tgc 126d 

caa^gctgag gaggagatca tgaagaaaat caggoagtct tatgaaaatg atattgcttc 132d 

tatgaatctt caagcagatt taattcctgg attaafttctg aacgcctftgg gtctgttccc 13bd 

acccacttca gggatgccac ctcccacctc agggccccct tcagccfitga ctcctccctft 1440 

cccccagttt gagcaatcag aaacggagac tgttcatcag tttatgccag ctctatcagt 150d 

cggtoccatc atcggcaagc agggccagca catcaagcag ctttctggct ttgctggagc 1560 

TTCAATTARG ATTGCTCCAG CGGAAGCACC AGATGCTWiA GTGAGGATGG TGATTATCW 1620 
TGGACCACCA GAGGCTCAGT TCAAGGCTCA GGGAAGAATT TATGGAAAAA TTAAAGAAGA 16fi0 
AAACTTTGTT AGTCCTAAAC AAGAGGTGAA ACTTGAAGCT CATATCAGAG TGCCATCCTT 1740 
TGCTGCTGGC AGnGTTATTG GAAAAGGAGG CAAAACGGTG AATGAACTTC AGAATTTGTC lfiOO 
AAGTGCAGAA GTTGTTGTCC CTCGTGACCA GACACCTGAT GAGAATGACC AAGTCGTTX3T 16^0 
CAAAATAAC7 GGTCACTXCT AT^CTTOCCR GGTTGCCCAG AGAAAAATTG AGGAAATTCT 1$20 
<3ACTCAGG!TA SACCAGCACC AACR?tCAGflA GGGTCTGCAA AGTGGACDAU CTCAGTCAAG .1$$0 
ACGGAAGTAA aggcxcagga aacagcccac cacagaggca gatgccaaac CAAAGACAGA 2040 
TTGCTTAACC AACAGATGGCi CGCTGftCCCC CTATCCAGAA TCACATGCAC AAGTTTOTAC 21.00 
CTAGCCPW3TT GUTTCIQhQ^ ACCAGCCAAC TTTTGAACTC CTGTCTCTGT GAGAATGTAT 2160 
ACTTTATGCT CTCTGAAATG TATGACACOC AGCTTTAAAA CAAACAMC?t AACAAACAAA 2220 
AAAAGGCTGQ GGGAGGGACG GAAAfiAGAAG AGCTCTGCAC TTCSCCTTTGT TGTAGTCtCA 2280 
CAGTATAACA GATATTCTAA TTCTTCTTAA TATTCCCCHA TAAO'GCCAGA AATTGGCTTA 2340 
ATGATGCTTT CACTAAA'WC ATCAAATAGA TTGCTCCT^A ATCCAA'fTGT TAAAATTGGA 2400 
TCAGAATAAT TATC^CAGGA ACTTAAATGT TAAGCCATTft GCATAGaAAA ACTGTXCTCA 2460 
GTTTTATTTT TACCTAACAC TAACATOAGT AACCTAAGGG AAGTGCTGAA TGCTGT^GGC 2520 
AGGGGTAtTA AACGTGCATr TTTACTCAAC TACCTCAGGT ATTCAGTAAT ACAATGAAAA 2SBD 
GCAAMTTGT TCCTTTTTTr TGAAAATTTT ATATACTTTA TAAT6ATAGA ACTCCAACCG 2640 
TTTTTTAAAA AATAAATTTA AAATTTAACA GCAATCAGCT AACAGGGAAA TTAAGATTTT 27 00 
TACTTCTGGC TGGTOACACT AAAGCTGGAA
ACACAjCsTTAT" TACTTAAATC AAATGTTCAA 
GCAGCACTAC CATTTMTCT TTCATTTA'i'A 
GTGG'i'CGCAG GAGMTTTGG ftAOGGCTGGT 
GTTTAGCTAC ATGATTGAAT GCA7AATAAA 
AGAAAG'i'GCA TCAGTCAAGA GAXGC&AGAC 
GCTTCTCTTA TAGGATGCTT agtttgccad 
TGGTOTGACA GTGTTTAAAC GCAACAAAAG 
GAGCCTCACT AAGCTATTTT GAAGATTTTT 
AW3TMACT CCACCTTAAG TAGTAAAGTA 
TOTCTTTGAAA AAAAAGTCAft AAGATAGAGA 
AATGACrGTG AAAACATATG acctttgata 
AAGCCCAGTA CGTACAATTG TGTTGG&TGT 
A&TCGATTTT TTGAGTTTTC GNTTGNAAGff 
GGftCArATNl* I'ATAACCCTT TAAAAWAAA 
ATTTOGATAC AGACTAGATG TCTTTCTGAA 
AGTOTTTTCC TTAATGTTCT CTGAAAACAA 
CCCTTTTTGT CACl'GGmC TCCTAGCAT? 
ATTGCTAAAA TCA1GGACTG GCTTTCTOOT 
GAGCTTTTCT CAGTATTTGA U^TTTTTTCCC 
AGGAGCTGCA TTTAAAAGCT UCiTGCTTTM 
GTATGCCflAA TCANAATTTA CTCTTACTTA 
GCTAAGAAAT AATTCR&TAA TTGAG*mTG 
ATAATGTOCC CCCAWGCAG CTTCArJf'I'C 
ATCaTTTAGG tccccaaaa 




AATTAATTTC AGGGTTTTTT GAGGCTTTTG 27 60 
AAATACGGftG CAGffGCCTAG XATCTGGAGA 2820 
GTTGGGAAAG TTFTTGACGG TACTAACAAA 28B0 
TTAAATGGCT TCAlaGAGACT TCAGTTTTTT 2940 
'fGCTTTGTGC TTCIGACTAT CftAT.AECTAA 3000 
TTTCAACTGA CTGGCAAAAA GCftAGHTTTA 30 60 
TACACTTCAG ACHAATGGGA CAGTWTAGA 5120 
GCTACATraC CATGGGGCCA GCACTGTCAT 3160 
AAGCACTGAT AAATTAAAAA A^JiAAAAAA-3240 
TAACAGGATT TCTGTAl'ACT GTGCAATCAG 3300 
ATACAAGAAA AGTTTTtiGGG ATATAATTTfi 3S«0 
ACGAACTCAT TTGCTGftCFC CTTGACA5CA 3420 
GGGTGGTCTC CMGBCCiACG CTCCTCTCTG 34$0 
TGATCACAGN CATGTTACAC TGATCTTOAa .1540 
ATCOOCTCCC TCATTCTTAT TTCGAGATGA 3£QfJ 
KATCAATTM 7\CATTHTGAA AATGATTTAA 3660 
GTT'fCTTTTC TAGTTTVAAC CAAAAAAGTG 3720 
CATGAtTTTO TTTTCACACA ATGAATTAAA 3760 
TGGATTTCAG GTAAGATGTG TTTAAGGCCA 3B40 
CAATATTTGA TTTTTTAAAA ATATACACAT 3D0O 
ATTCTGTCAH ATTTCACTTC TAGCCTTTTA 3960 
AGCATTTOTA ATTTGGAGTA TCTGGTACTA 4020 
TACTC^CCAA AMATGGGTCA TTCCTCATGN 4QB0 
CAGANACCTT GAOGCAGGAT AAATTTTTTC 4140 

4159



<210> 5

<211> 1708

<212> DNA

<213> Homo sapiens

<220>

<400> 5 



AG5GACGDTG CCGCACCGCC CCAGTTTAC.C 
CCAGTTGEAG AACCttfGOCC TGAAGCTCTC 

acctgagaat gggogccgag ggggctttgg 
tgtggcagcg ggggccocftg ccaagcagca 
gcccacccag tatctgggu'g ccastattog 
aaaacagaoc cagtccaaga l'agagctgca 
acccatcajgt gtocactcca cccctgagsg 
gattatgcat aaagaogl'ta aggacaocaa 

GGCCCATAAT AACTTTGTAG GGCGTCTCAT 
AGAGCAAGAT AfXGAGACAA AAATCAGCAT 
OCCTGAGAGG ACCA7T-ACTG TGAAGGGGCC 
AATAATGAAG AAAGTTCGGR AGGCCTATGA 
GATCOCTGGC CTGAACCTGI5 CT6CTGTAGG 
GCCGCCTCCC AGCAGCGTTA CTGOSGCTGC 
GCAOOAGATG GTCCAGCTGT TTATCCCCGC 
GGGGCAGCAC ATCAAACAGC TCTCCCGGTT
CGAAACACCT GACTCCAAAG TTCGTATGGT
CAAGGCTCAG GGAAGAATCT ATGGCAAACT
GGAAGTGAAG CTGGAGACCC ACATACGTGT
CAAAGGTGGA AAAACGGTGA ACGAGTTGCA
AAGAGACCAG ACCCCTGATG AGAACGACCA
TGCCAGTCAG ATGGCTCAAC GGAAGATCCG
TCAGAAGGGA CAGAGTAACC AGGCCCAGGC
CTTNGAGTCC AGGACAACAA CGGGCAGAAA
GAATGAGTGG GAATCCGGGA CACNTGGGCC
GAGAAAGATG TTCCAGTGAG GAACCCTGAT
CCAACACTGT NTGCCCCTCG GGGTGTCAGA
ATTGTTTAAA GAAGCTCTCC AGGCCCCACC
AAAATAAAAT TTCCTTCAGG TTTTAAAA



OCGGGGAGCC ATCATGAAGC TGAATGOCCA ftO 
CTACATCCCC GATGAGCAGA TAGCACAGGCi 120 
CTCTOGGGCT CAGCCCCGCC ACGGC!TCACC ISO 
GCAAGTGGAC ATCCUCCTTC GGCTCCTGGT 24 Q 
CAAQ&AGGGC GCCACCATCC GCAACATCAC 300 
TAGGAflGGAC ^ACGCAGGTG CAGCTG^^A 360 
CTGCTCCTCC GCTTGTAAGA TCATCTTGRA 420 
ftACGGCTGAC GAGGTTCCCC TCAAGATCCT 4 BO 
TGGCAAGGAA GGACGGAAOC TGAAGAACWT faiO 
CTTCTCOTC CMGACCTl l A CCOTTTACAA 600 
CATCGAC5AAT TGTTGCAGGG CCGACCA6GA 6^0 
GAATGATGTG GCTGCCATGA GCTCTCACCT 
TCTTTTCCCA GCTTCATCCA -GCGCAGTCCC 7$0 
TCCCTATAGC TCCTTTATGC AGGCTCCCGA 
CCAGGCAGTG GGCCCCA'ICA TCGGCAAGAA VQO 
TGCCAGCGCC TCCATCAAGA TTGCACCACC 960
TATCATCACT GGACCGCCAG AGGCCCAATT 1020
CAAGGAGGAG AACTTCTTTG GTCCCAAGGA 1080
GCCAGCATCA GCAGCTGGCC GGGTCATTGG 1140
GAATTTGACG GCAGCTGAGG TGGTAGTACC 1200
GGTCATCGTG AAAATCATCG GACATTTCTA 1260
AGACATCCTG GCCCAGGTTA AGCAGCAGCA 1320
ACGGAGGAAG TGACCAGCCC CTCCCTGTCC 1380
TCGAGAGTGT GCTCTCCCCG GCAGGCCTGA 1440
GGGCTGTAGA TCAGGTTTGC CCACTTGATT 1500
CTNTCAGCCC CAAACACCCA CCCAATTGGC 1560
AATTNTAGCG CAAGGCACTT TTAAACGTGG 1620
AAGAGGGTGG ATCACACCTC AGTGGGAAGA 1680

1708
<210> 6
<211> 3412
<212> DNA

<213> Homo sapiens

<220>

<400> 6

GGCAGCGGAG GAGGCGAGGA GCGCCGGGTA CCGGGCCGGG GGAGCCGCGG GCTCTCGGGG 60
AAGAGACGGA TGATGAACAA GCTTTACATC GGGAACCTGA GCCCCGCCGT CACCGCCGAC 120
GACCTCCGGC AGCTCTTTGG GGACAGGAAG CTGCCCCTGG CGGGACAGGT CCTGCTGAAG 180
TCCGGCTACG CCTTCGTGGA CTACCCCGAC CAGAACTGGG CCATCCGCGC CATCGAGACC 240
CTCTCGGGTA AAGTGGAATT GCATGGGAAA ATCATGGAAG TTGATTACTC AGTCTCTAAA 300
AAGCTAAGGA GCAGGAAAAT TCAGATTCGA AACATCCCTC CTCACCTGCA GTGGGAGGTG 360
TTGGATGGAC TTTTGGCTCA ATATGGGACA GTGGAGAATG TGGAACAAGT CAACACAGAC 420
ACAGAAACCG CCGTTGTCAA CGTCACATAT GCAACAAGAG AAGAAGCAAA AATAGCCATG 480
GAGAAGCTAA GCGGGCATCA GTTTGAGAAC TACTCCTTCA AGATTTCCTA CATCCCGGAT 540
GAAGAGGTGA GCTCCCCTTC GCCCCCTCAG CGAGCCCAGC GTGGGGACCA CTCTTCCCGG 600
GAGCAAGGCC ACGCCCCTGG GGGCACTTCT CAGGCCAGAC AGATTGATTT CCCGCTGCGG 660
ATCCTGGTCC CCACCCAGTT TGTTGGTGCC ATCATCGGAA AGGAGGGCTT GACCATAAAG 720
AACATCACTA AGCAGACCCA GTCCCGGGTA GATATCCATA GAAAAGAGAA CTCTGGAGCT 780
GCAGAGAAGC CTGTCACCAT CCATGCCACC CCAGAGGGGA CTTCTGAAGC ATGCCGCATG 840
ATTCTTGAAA TCATGCAGAA AGAGGCAGAT GAGACCAAAC TAGCCGAAGA GATTCCTCTG 900
AAAATCTTGG CACACAATGG CTTGGTTGGA AGACTGATTG GAAAAGAAGG CAGAAATTTG 960
AAGAAAATTG AACATGAAAC AGGGACCAAG ATAACAATCT CATCTTTGCA GGATTTGAGC 1020
ATATACAACC CGGAAAGAAC CATCACTGTG AAGGGCACAG TTGAGGCCTG TGCCAGTGCT 1080
GAGATAGAGA TTATGAAGAA GCTGCGTGAG GCCTTTGAAA ATGATATGCT GGCTGTTAAC 1140
CAACAAGCCA ATCTGATCCC AGGGTTGAAC CTCAGCGCAC TTGGCATCTT TTCAACAGGA 1200
CTGTCCGTGC TATCTCCACC AGCAGGGCCC CGCGGAGCTC CCCCCGCTGC CCCCTACCAC 1260
CCCTTCACTA CCCACTCCGG ATACTTCTCC AGCCTGTACC CCCATCACCA GTTTGGCCCG 1320
TTCCCGCATC ATCACTCTTA TCCAGAGCAG GAGATTGTGA ATCTCTTCAT CCCAACCCAG 1380
GCTGTGGGCG CCATCATCGG GAAGAAGGGG GCACACATCA AACAGCTGGC GAGATTCGCC 1440
GGAGCCTCTA TCAAGATTGC CCCTGCGGAA GGCCCAGACG TCAGCGAAAG GATGGTCATC 1500
ATCACCGGGC CACCGGAAGC CCAGTTCAAG GCCCAGGGAC GGATCTTTGG GAAACTGAAA 1560
GAGGAAAACT TCTTTAACCC CAAAGAAGAA GTGAAGCTGG AAGCGCATAT CAGAGTGCCC 1620
TCTTCCACAG CTGGCCGGGT GATTGGCAAA GGTGGCAAGA CCGTGAACGA ACTGCAGAAC 1680
TTAACCAGTG CAGAAGTCAT CGTGCCTCGT GACCAAACGC CAGATGAAAA TGAGGAAGTG 1740
ATCGTCAGAA TTATCGGGCA CTTCTTTGCT AGCCAGACTG CACAGCGCAA GATCAGGGAA 1800
ATTGTACAAC AGGTGAAGCA GCAGGAGCAG AAATACCCTC AGGGAGTCGC CTCACAGCGC 1860
AGCAAGTGAG GCTCCCACAG GCACCAGCAA AACAACGGAT GAATGTAGCC CTTCCAACAC 1920
CTGACAGAAT GAGACCAAAC GCAGCCAGCC AGATCGGGAG CAAACCAAAG ACCATCTGAG 1980
GAATGAGAAG TCTGCGGAGG CGGCCAGGGA CTCTGCCGAG GCCCTGAGAA CCCCAGGGGC 2040
CGAGGAGGGG CGGGGAAGGT CAGCCAGGTT TGCCAGAACC ACCGAGCCCC GCCTCCCGCC 2100
CCCCAGGGCT TCTGCAGGCT TCAGCCATCC ACTTCACCAT CCACTCGGAT CTCTCCTGAA 2160
CTCCCACGAC GCTATCCCTT TTAGTTGAAC TAACATAGGT GAACGTGTTC AAAGCCAAGC 2220
AAAATGCACA CCCTTTTTCT GTGGCAAATC GTCTCTGTAC ATGTGTGTAC ATATTAGAAA 2280
GGGAAGATGT TAAGATATGT GGCCTGTGGG TTACACAGGG TGCCTGCAGC GGTAATATAT 2340
TTTAGAAATA ATATATCAAA TAACTCAACT AACTCCAATT TTTAATCAAT TATTAATTTT 2400
TTTTTCTTTT TAAAGAGAAA GCAGGCTTTT CTAGACTTTA AAGAATAAAG TCTTTGGGAG 2460
GTCTCACGGT GTAGAGAGGA GCTTTGAGGC CACCCGCACA AAATTCACCC AGAGGGAAAT 2520
CTCGTCGGAA GGACACTCAC GGCAGTTCTG GATCACCTGT GTATGTCAAC AGAAGGGATA 2580
CCGTCTCCTT GAAGAGGAAA CTCTGTCACT CCTCATGCCT GTCTAGCTCA TACACCCATT 2640
TCTCTTTGCT TCACAGGTTT TAAACTGGTT TTTTGCATAC TGCTATATAA TTCTCTGTCT 2700
CTCTCTGTTT ATCTCTCCCC TCCCTCCCCT CCCCTTCTTC TCCATCTCCA TTCTTTTGAA 2760
TTTCCTCATC CCTCCATCTC AATCCCGTAT CTACGCACCC CCCCCCCCCC AGGCAAAGCA 2820
GTGCTCTGAG TATCACATCA CACAAAAGGA ACAAAAGCGA AACACACAAA CCAGCCTCAA 2880
CTTACACTTG GTTACTCAAA AGAACAAGAG TCAATGGTAC TTGTCCTAGC GTTTTGGAAG 2940
AGGAAAACAG GAACCCACCA AACCAACCAA TCAACCAAAC AAAGAAAAAA TTCCACAATG 3000
AAAGAATGTA TTTTGTCTTT TTGCATTTTG GTGTATAAGC CATCAATATT CAGCAAAATG 3060
ATTCCTTTCT TTAAAAAAAA AAATGTGGAG GAAAGTAGAA ATTTACCAAG GTTGTTGGCC 3120
CAGGGCGTTA AATTCACAGA TTTTTTTAAC GAGAAAAACA CACAGAAGAA GCTACCTCAG 3180
GTGTTTTTAC CTCAGCACCT TGCTCTTGTG TTTCCCTTAG AGATTTTGTA AAGCTGATAG 3240
TTGGAGCATT TTTTTATTTT TTTAATAAAA ATGAGTTGGA AAAAAAATAA GATATCAACT 3300
GCCAGCCTGG AGAAGGTGAC AGTCCAAGTG TGCAACAGCT GTTCTGAATT GTCTTCCGCT 3360
AGCCAAGAAC CNATATGGCC TTCTTTTGGA CAAACCTTGA AAATGTTTAT TT 3412
<210> 7
<211> 1946
<212> DNA

<213> Homo sapiens

<220>

<400> 7

GCTGTAGCGG AGGGGCTGGG GGGCTGCTCT GTCCCCTTCC TTGCGCGCTG CGGCCTCAGC 60
CCACCCAGAG GCCGGGGTGG GAGGGCGAGT GCTCAGCTTC CCGGGTTAGG AGCCGGAAAA 120
TTCAAATCCG AAATATTCCA CCCCAGCTCC GATGGGAAGT ACTGGACAGC CTGCTGGCTC 180
AGTATGGTAC AGTAGAGAAC TGTGAGCAAG TGAACACCGA GAGTGAGACG GCAGTGGTGA 240
ATGTCACCTA TTCCAACCGG GAGCAGACCA GGCAAGCCAT CATGAAGCTG AATGGCCACC 300
AGTTGGAGAA CCATGCCCTG AAGGTCTCCT ACATCCCCGA TGAGCAGATA GCACAGGGAC 360
CTGAGAATGG GCGCCGAGGG GGCTTTGGCT CTCGGGGTCA GCCCCGCCAG GGCTCACCTG 420
TGGCAGCGGG GGCCCCAGCC AAGCAGCAGC AAGTGGACAT CCCCCTTCGG CTCCTGGTGC 480
CCACCCAGTA TGTGGGTGCC ATTATTGGCA AGGAGGGGGC CACCATCCGC AACATCACAA 540
AACAGACCCA GTCCAAGATA GACGTGCATA GGAAGGAGAA CGCAGGTGCA GCTGAAAAAG 600
CCATCAGTGT GCACTCCACC CCTGAGGGCT GCTCCTCCGC TTGTAAGATG ATCTTGGAGA 660
TTATGCATAA AGAGGCTAAG GACACCAAAA CGGCTGACGA GGTTCCCCTG AAGATCCTGG 720
CCCATAATAA CTTTGTAGGG CGTCTCATTG GCAAGGAAGG ACGGAACCTG AAGAAGGTAG 780
AGCAAGATAC CGAGACAAAA ATCACCATCT CCTCGTTGCA AGACCTTACC CTTTACAACC 840
CTGAGAGGAC CATCACTGTG AAGGGGGCCA TCGAGAATTG TTGCAGGGCC GAGCAGGAAA 900
TAATGAAGAA AGTTCGGGAG GCCTATGAGA ATGATGTGGC TGCCATGAGC TCTCACCTGA 960
TCCCTGGCCT GAACCTGGCT GCTGTAGGTC TTTTCCCAGC TTCATCCAGC GCAGTCCCGC 1020
CGCCTCCCAG CAGCGTTACT GGGGCTGCTC CCTATAGCTC CTTTATGCAG GCTCCCGAGC 1080
AGGAGATGGT GCAGGTGTTT ATCCCCGCCC AGGCAGTGGG CGCCATCATC GGCAAGAAGG 1140
GGCAGCACAT CAAACAGCTC TCCCGGTTTG CCAGCGCCTC CATCAAGATT GCACCACCCG 1200
AAACACCTGA CTCCAAAGTT CGTATGGTTA TCATCACTGG ACCGCCAGAG GCCCAATTCA 1260
AGGCTCAGGG AAGAATCTAT GGCAAACTCA AGGAGGAGAA CTTCTTTGGT CCCAAGGAGG 1320
AAGTGAAGCT GGAGACCCAC ATACGTGTGC CAGCATCAGC AGCTGGCCGG GTCATTGGCA 1380
AAGGTGGAAA AACGGTGAAC GAGTTGCAGA ATTTGACGGC AGCTGAGGTG GTAGTACCAA 1440
GAGACCAGAC CCCTGATGAG AACGACCAGG TCATCGTGAA AATCATCGGA CATTTCTATG 1500
CCAGTCAGAT GGCTCAACGG AAGATCCGAG ACATCCTGGC CCAGGTTAAG CAGCAGCATC 1560
AGAAGGGACA GAGTAACCAG GCCCAGGCAC GGAGGAAGTG ACCAGCCCCT CCCTGTCCCT 1620
TNGAGTCCAG GACAACAACG GGCAGAAATC GAGAGTGTGC TCTCCCCGGC AGGCCTGAGA 1680
ATGAGTGGGA ATCCGGGACA CNTGGGCCGG GCTGTAGATC AGGTTTGCCC ACTTGATTGA 1740
GAAAGATGTT CCAGTGAGGA ACCCTGATCT NTCAGCCCCA AACACCCACC CAATTGGCCC 1800
AACACTGTNT GCCCCTCGGG GTGTCAGAAA TTNTAGCGCA AGGCACTTTT AAACGTGGAT 1860
TGTTTAAAGA AGCTCTCCAG GCCCCACCAA GAGGGTGGAT CACACCTCAG TGGGAAGAAA 1920
AATAAAATTT CCTTCAGGTT TTAAAA 1946



<210> 0 
<211> 3283
<212> DNA

<213> Homo sapiens

<220>

<400> U 

GGCAGCGGAG GAOGCGACGA GCGCCGGGTA CCGGGCCGGG GGAGrCGCGG GCTCTCCGGG 60
AAGAGACGGA TGATGAACAA GCTTTACMC GGGAACCTGA GCCCHGCOST CAOCGCCGAC 120
GACCTCCGGC AGCTCTTrGG GGACAGGAW CTQCCCCTGG CGGGACAGGT CCTGCTGAAG 180
rCCGGCTACG CCTTCGTGEA CTACCCCGAC CAGAACTGGG CCATCCGOGC CATCGAGACC 240
CXCTOGGGTA A^CTGGAATT GCATGGGAAA ATCATGGAAG TTOATTACTC AGTCTCTAAA 300
AAGCTAAGGA GCAGGAAfcAT TCAGATTCGA AACATOCCTC CTCAC^TGCA G'lVSGGAGGTG 360
TTQGATGGAC TTTTGGCTCA ATATGGGACA GFGGAGAATG TGGAACAART CAACACAGAC 420
ACAGAAACCG CCGTTGTCATi CG^CACATAT GCAAOAAGAG AAGAAGCMA AATAGOCATG 480
GAGAAGCTAA GCGGGCATCA KTTTGAGAAC TACTCCTTCA AGATTTCCTA CATCCCGGAT 540
GAAGAGGTGA GCTCCCCTTC GGCCOCTCAG CGAGCCCAGC GTGGGGACCA CTCTTCCCGG 600
GAGCAAGGCG ACGCCCCTGG GGGCACTTCT CAGGCCAGAC AGATTGATTT CCCGCTGCGG 660
ATCCTGGTCC CCACCCAGTTT TGTTC^TGOC ATCATCGGAA AGGAGGGCTP GACCATAAAG 720
AACATCACTA AGCAGACCCA GTCOCGGGtA GATATCCATA GAAAAGACAA CTCTGGflGCT 780
GCAGAGAAGC CTCTCADCAT CCATfiCCACC CCAGAGGGGA CTtCTGAAGC ATGCCGCA1G 840
ATTCTTGAAA TCATGCAGAA AGAGGCAGAT GAGACCAMC TAGCCGAAGA GATTCCTCTG 900
AAAATCTTGG CACACAATGG CTTGGTtGGA AGACTGATTG GAAAAGAAGG CAGA&ATOPG 960
AAGAAAATTG AACATGAAAC AGGGADCMG ATAACAATCT CATCTTTGCA GGA^TTGAGC 1020
ATATACAACC CGGAAAGAAC CATCACTGTG AAGGGCACAG TTGAGGCCSG TGCCAGTGCT 1080
GAGATAGAGA TTATGAAGAA GCTGCGTGAG CCCTTTGAAA /VTGATATGC? GGCTGTTAAC 1140
ACCCACTCCG GATACTTCTC CAGCCTGTAC CCCCATCACC AGTTTGGCCC GTTCCCCCAT 1200
CATCACTCTT ATCCJlGAGCA GGAGATTGTO AATCTCT7CA TCCCAACCCA GGCTGTOGGC 1260
GCC&TCATCG GGAAGAAGGG GGCACACATC AAACAGCIGG CGA6ATTCGC CGGAGCCTCT 1320
ATCAAGAT7G CCCCTGCGGA AGGCCCAGAC GTCAGCGAAA GGATGGMRT CATCACCGGG 1380
CCACCCGAAG CCCAGTTCAA GGCCCAGGGA CGGATCTTTG GGAAAGTGAA AGAGCAAAAC 1440
TTCTTTAACC CCAAAGMGR AGTGAAGCTG GAAGCGCATA TCAGAGTGCC CTCTTCCACA 1500
GCTGCCCGGG TGATTGGCAA AGGTGGCAAG ACCCTGAACG AACTGCAGAA CTTAACCAGT 1560
GCAGAAGTCA TCGTGCCTC6 TGACCAAACG CCAGATGAAA ATGAGGAAGT GATCGTCAGA 1620
ATTATCGGGC ACTTCTXTGC 7AGCCAGACT GCACAGCGCA flGATCAGGGA AATTGTACAA 1680
CAGOTGAAGC AGCAGGAGCA GAAATACCCT GAGGGAGTCG CCTCACAGCG CAGCAAGTGA 1740
GGCTCCCACA RGCncCAGCA AAACAACGGA MAAITGTAGC CCTTOCAACA CCTGAGAGAA 1800
TGAGACCAAA CGCAGCCAGC CAGATCGGGA GCAAACCftAA GACCAtCTGA GGAATGAGAA 1860
GTCTCCGGAG GCGGHnAGGG ACTCTGCCGA GGCCCTGAGA ACOOfcAGGGG CCGAGGAGGG 1920
GCGGGGAAGG TCAGCCAGGT tTGCCAGAAC CACCSAGCCC CGCCXCCCGC COCSCCAGGGC 1980
TTCTGCAGGC TTCAGCCttTC CftCTTCACCA TCCACTCGGA TCTCTCCTGA ACTCCCACGA 2040
COCTATCCCT TTTAGTTGAA CTAACATAGG TGAACCTGT? CAAAGCCAAG CAAAATGCAC 2100
AOCCTTTTTC TCTGGCAfiAT CGTCtCTGTA C&TGTGTGTA CATATTW3AA AGGGAAGftTG 2160
TTAAGATATG TGGCCTGTGB GTTACACAG& Gl'GCCTGCAG CGOTAATATA TTTTAGAAAT 2220
AATATATCAA ATAACTCAAC TAACTCCAAT TTTMATCAA TTATTAATTT TTTTTTCTTT 2280
TTORAGAGaA AGCAGGCJTT TCTAGACTTT AAAGAATAAA GTCTTTGGGA GOTCTCACGG 2340
tgtagftgagg agctttgagg ccacccgcac AAAATTCACC CAGAGGGAAA TCTCGTCGGA 2400
aegacactca cggcagftct ogatcacctg TGTATGTCAA CAGAAGGGAT ACCGTCTCCT 2460
tgaagfiggaa actctctcac tcctcatgcc TGtfCTAGCTC ATACACCCAT TTCTCriTGC 2520
ttcacaggtt ttaaactggt tttttgcata CTGCTAMTA ATTCTCTOTC FCTCTCTGTT 2580
TATCl'CTCCC CTCCCT'CCCC TCCCCTTCTT CJTCCATCTCC ATTCTTTTGA ATTTOCXCAT 2640
CCCTCCATCT CAATCCCGTA TCTACGCACC occccccccc gaggc-ba^gc agtgctctga 2700
GTATCACATC ACACAAAAGG AACAAAAGCG aaacacacaa accawctca acttacacl'j' 2760
GGmCTCAA AACAACAAGA GTCAATGGTA citgtcctag cgt1txggaa gaggaraaca 2820
GGAACCCACC AAACCAACCA /iTCAACCRAA caaagaaaaa attccacaat gaaa^saatgcd 2880
ATTTTGTCTr TTTGCATTTT GGTGTATAAG ccatcaafat tca<5caaaat gattcctttc 2940
tttaaaaaaa aaaatctgga ggaaagtaga aatttaccaa ggttgttggc ccagggcgtt 3000
aaattcacag atttttttaa cg3w3aaraac acacagaaga agctocctga ggpgttttta 3060
cctcagcacc ttgctcttct gtttccctta gagattttct aaa^ctgata gttggagcat 3120
tttttvattt tttoaataaa aatgagttgg aaaaaaaata agatatcaac tgocagcctg 3180
gagaftggtga cagtccaagt gtggaacagc tgttctgaat t<?rnttccgc tagccaagaa 3240
ccnatatggc cttcttttgg tvcaaaccttg AAAATGTTTA TOT 3283